Abstract


The third U.S. Army Conference on Applied Statistics was hosted by George Mason University
(GMU) during 22-24 October 1997 at the recently opened Johnson Center on campus. The
conference was cosponsored by the U.S. Army Research Laboratory (ARL), the U.S. Army Research
Office (ARO), the U.S. Military Academy (USMA), the U.S. Army Training and Doctrine Command
(TRADOC) Analysis Center-White Sands Missile Range, the Walter Reed Army Institute of
Research (WRAIR), and the National Institute of Standards and Technology (NIST). The U.S.
Army Conference on Applied Statistics is a forum for technical papers on new developments in
statistical science and on the application of existing techniques to Army problems. This document
is a compilation of available papers offered at the conference.


FOREWORD 


The third U.S. Army Conference on Applied Statistics was hosted by George Mason University (GMU)
during 22-24 October 1997 at the recently opened Johnson Center on campus. The conference was
cosponsored by the U.S. Army Research Laboratory (ARL), the U.S. Army Research Office (ARO), the
U.S. Military Academy (USMA), the U.S. Army Training and Doctrine Command (TRADOC) Analysis
Center-White Sands Missile Range (TRAC-WSMR), the Walter Reed Army Institute of Research
(WRAIR), and the National Institute of Standards and Technology (NIST). The U.S. Army Conference
on Applied Statistics is a forum for technical papers on new developments in statistical science and on
the application of existing techniques to Army problems. The Army faces far-ranging challenges that
encompass many topics in which probability and statistics have a contribution to make. The purpose
of this conference is to promote the practice of statistics in the solution of these diverse Army problems.

The third conference was preceded by a short course, "Virtual Reality and Scientific Visualization,"
given by Professors Edward Wegman and Dan Carr. Several opportunities were afforded conference
participants to see and work with the scientific visualization tools at GMU. Distinguished speakers
spoke during invited general sessions: Prof. Thomas Hettmansperger, Penn State University;
Prof. Jeffrey Birch, Virginia Tech; Prof. Lyle Ungar, University of Pennsylvania; Prof. Judea Pearl,
University of California, Los Angeles; Prof. Donald Berry, Duke University; Prof. Carey Priebe, Johns
Hopkins University; Prof. J. David Cooke, University of Western Ontario; and Dr. Eric Lagergren, NIST.
Topical methodological areas included, for example, nonparametric methods, experimental design,
neural networks, fuzzy control, statistical process control, genetic algorithms, robust design, density
estimation, and new advances in statistical software for data visualization. Application areas included,
for example, the digitized battlefield, animation of dust behavior for incorporation in simulation models,
communication among robot scouts on the battlefield, chemical discrimination, parts inventory control,
camouflage effectiveness, terrain modeling, and mine-field detection. A highlight of the conference was
the awarding of the Army Wilks Award to William Jay Conover of Texas Tech for years of substantive
contribution to statistical methodology and the practice of statistics in the Army.

The Executive Board for the conference recognizes Dr. Robert Launer, ARO, Dr. Mark Vangel, NIST,
and Mr. David Webb, ARL, for assisting with conference details; Dr. Jock Grynovicki for oversight of
the Army Wilks Award; Dr. Barry Bodt, ARL, for general conference administration and proceedings;
and Prof. Edward Wegman, GMU, for hosting the conference and handling all local arrangements.
Special thanks are due Ms. Patricia Joyce, GMU, who assisted in handling many on-site details.
ABSTRACT

The purpose of this report is to provide the results of a calibration study performed by the Human
Research and Engineering Directorate (HRED) in accordance with a Technical Program Annex (TPA) agreement
between the Natick Research, Development, and Engineering Center (NRDEC) and HRED. This report describes
the objectives, methodological approach, apparatus, experimental design, data collection procedures, and the
results of the study.


INTRODUCTION 

The human perception of camouflaged targets is influenced by a number of variables. Examples include
the type of camouflage pattern, the type of background, the density of the background, as well as the type and
direction of illumination. Typically, evaluations of prototype camouflage patterns are performed in field trials
through the use of numerous groups of observers who rate the effectiveness of various patterns against different
backgrounds. Large field studies are difficult to conduct as they involve the problem of finding suitable test sites,
the performance of many test observations, unpredictable weather, and variations in foliage resulting from the
change of seasons. Hence, the Natick Research, Development, and Engineering Center (NRDEC) developed the
Mobile Army Camouflage Evaluation (MACE) system as a cost effective means of automating the process of
evaluating camouflage effectiveness. The benefit of this research for the Department of Defense is to reduce the
time and cost of field testing existing and proposed camouflage patterns by providing a portable computerized
system that simulates human perception and judgment of camouflage effectiveness in field settings.

The MACE system has the capability of providing the U.S. Army with a field-portable device to
objectively evaluate candidate camouflage ensembles. The system assesses camouflage effectiveness by using
optically-filtered and digitized video images that are subsequently processed by computer algorithms in an attempt
to model human visual perception as it relates to camouflage phenomena. To calibrate the MACE device, the
Human Research and Engineering Directorate (HRED) of the Army Research Laboratory (ARL) assisted NRDEC
personnel in conducting a calibration study. This report presents the MACE measurements correlated with the
soldiers' ratings of camouflage effectiveness in order to calibrate MACE measures as predictors of soldier
performance.

Overview 

The MACE system consists of hardware and software for analyzing digitized video images of
camouflaged targets and their backgrounds. The purpose of the MACE system is not to search for targets, but to
measure the camouflage effectiveness of a target against its background, simulating human perception. The system
collects 18 basic optical measures (Kilian, 1992) of a target and its background. Each of these optical measures
consists of three levels of measurement related to the CIELAB color description coordinates as standardized by the
Commission Internationale de l'Eclairage of 1931. These levels are designated by "L*" (lightness level), "A*"
(red-green scale value), and "B*" (yellow-blue scale value). Consequently, the total number of measures or
variables acquired for a single MACE system observation is 54 (i.e., 18 basic optical measures x 3 levels L, A, and
B). However, presently, the 54 variables are treated as having equal importance in predicting human performance.
It is likely that there are differences in the predictive values of these various variables. Consequently, empirical
data had to be obtained to determine these differential weights.
Objective

The objective of the study was to calibrate the MACE system by determining the relative importance of
each of the 54 MACE variables in predicting human subjective ratings of camouflage effectiveness. Specifically,
weighting coefficients for MACE system variables were to be based upon soldier ratings of the effectiveness of five
different camouflage patterns correlated with the MACE system variables of the target ensemble patterns and their
associated backgrounds.

Additionally, the study sought to quantify the effect of the direction of illumination on the human observer
ratings of camouflage effectiveness of the candidate ensembles. The research hypothesis was that there would be
differences in the human judgments of the camouflage effectiveness of the various ensembles when the camouflage
targets and background are front lit versus back lit. The statistical null hypothesis was that there would be no
difference in human judgments when the camouflage targets and background are front lit versus back lit.


METHOD 

Apparatus 

MACE System. The MACE system (Kilian, 1992) studied used a video camera, a series of ten optical
filters, and a computer to digitally store an image containing the camouflaged target and its background. Each of
the ten filters sampled a different wavelength from the electromagnetic spectrum (380 nm to 740 nm). For each
image stored, an outline of the camouflaged target portion of the image was made and then the digitized pixel
information within the target outline was stored separately from the background information portion. Next, 54
variables were computed that were thought to differentiate the target from the background. These variables were
based on features such as light reflectance, spectral wavelength, and texture. During the study, the MACE
equipment was operated and maintained by NRDEC personnel.

Targets— The  calibration  study  compared  five  candidate  camouflage  patterns.  The  patterns  were 
woodland,  olive  drab,  desert,  solid  gray,  and  black.  Though  one  would  expect  the  woodland  pattern  to  be  most 
effective against a wooded background, the traditional olive drab continues to show its concealment value under
certain conditions (Bensel, Ramsley, & Bushnell, 1977). The solid gray ensemble has been proposed for use in
built-up and urban areas but may also be effective in wooded situations. The desert pattern may also be effective in
the woods where there is sparse foliage and bright sunlight with a dried grassy background. The black pattern,
intended for night use, may also be effective in dense and shaded wooded areas.

Target  Backgrounds— The  backgrounds  of  the  target  ensembles  consisted  of  homogeneous  grass,  trees, 
and their associated foliage. The tree trunks and branches were brownish-gray while the foliage consisted of
clusters  of  mostly  green  leaves. 

Observation  Stations— An  observation  station,  consisting  of  10  individual  observer  cubicles,  was  provided 
for the test participants to view the pairs of camouflage ensembles. The individual observation cubicles were used
(as opposed to a group observation and judgment situation) to obtain independent test participant judgments of the
camouflage effectiveness of the ensembles. The test participants sat in the cubicles, which blocked their vision
down range. When signaled, the test participants stood to observe the camouflage target ensemble pair for 10
seconds and then were seated.

Test  Site  Layout  —The  overall  test  site  layout  is  presented  in  Figure  1.  The  Figure  shows  a  distance  of  30 
meters  from  the  MACE  apparatus  and  the  test  observers  to  the  target  camouflage  ensembles.  The  distance  between 
the  observer  and  the  target  was  considered  critical  to  the  calibration  process.  It  is  a  generally  accepted  principle 
that the details of an image of an object are less discriminated as the distance between the observer and the object
is increased. Likewise, the MACE system resolution decreases as a function of distance. Preliminary
investigations at Natick showed a target filled one-half of the vertical extent of the MACE computer monitor
height when the MACE to target distance was approximately 13 meters. It did not seem reasonable to expect the
MACE system to provide optimal data when the target height was less than 20% of the MACE display height. The
size of the target area that is represented by one MACE camera pixel is approximately one centimeter at 20 meters
and two centimeters at 40 meters. However, at a distance of 40 meters and beyond, the target height would be less
than 20% of the MACE display height and so small that only large areas of contrast would be discriminated. Thus,
it was decided that a distance of 30 meters be used to provide a level of sensitivity at which the MACE system
could  adequately  discriminate  the  detail  that  is  found  in  woodland  and  desert  camouflage  patterns. 
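The distance reasoning above can be checked with a short calculation. The following is a minimal sketch, assuming simple proportional scaling (apparent target height inversely proportional to distance, pixel footprint directly proportional to distance); the report quotes only the reference figures, so the scaling rule and the function names are illustrative assumptions rather than part of the MACE software.

```python
# Assumption: simple proportional (pinhole-camera) scaling, anchored to the
# reference figures quoted in the text. Function names are hypothetical.

REF_HEIGHT_FRACTION = 0.5      # target fills one-half of the display height ...
REF_HEIGHT_DISTANCE_M = 13.0   # ... at approximately 13 meters
REF_PIXEL_CM = 1.0             # one pixel covers about one centimeter ...
REF_PIXEL_DISTANCE_M = 20.0    # ... at 20 meters


def display_height_fraction(distance_m: float) -> float:
    """Fraction of the display height filled by the target at a given distance."""
    return REF_HEIGHT_FRACTION * REF_HEIGHT_DISTANCE_M / distance_m


def pixel_footprint_cm(distance_m: float) -> float:
    """Approximate target area (edge length in cm) covered by one camera pixel."""
    return REF_PIXEL_CM * distance_m / REF_PIXEL_DISTANCE_M


for d in (13, 20, 30, 40):
    print(f"{d:>2} m: height fraction = {display_height_fraction(d):.3f}, "
          f"pixel footprint = {pixel_footprint_cm(d):.2f} cm")
```

Under these assumptions the target fills roughly 22% of the display height at 30 meters and about 16% at 40 meters, which agrees with the statement that the target would fall below the 20% threshold at 40 meters and beyond.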
Figure 1. Diagram of test area (overhead view) showing relationship of the left and right camouflage
patterned target ensemble positions (TPs) to the MACE system and the ten observation positions (OPs).


Left and right target ensembles, spaced 10 meters apart, were presented to correspond with two tree
foliage backgrounds that were homogeneous. Front lit illumination occurred when the sun position at the test site
(see Figure 1) was in the west and the resulting shadows, from the observers' viewpoint, were behind the target
patterns and the corresponding backgrounds. Back lit illumination occurred when the sun was in the east and the
resulting shadows were in front of the target patterns and backgrounds. The view of the target ensembles and their
backgrounds, as seen by the test observers and the MACE system video camera, is shown in Figure 2.


Figure 2. The diagram shows a simulated MACE system computer video image display of two target camouflage
ensembles with similar foliage backgrounds. The white outline around each target shows the background
area of interest (AOI) that was used for acquiring the data to compute the MACE system variables' values.
A reflectance board with eight light levels was used to standardize the MACE video camera input.


Test  Participants 

Ten male enlisted U.S. Army soldiers (ages ranging from 19 to 35 years) participated as judges of the
camouflage effectiveness offered by the five camouflage patterns. Each candidate observer was tested for visual
acuity and color deficiency. Only candidates whose visual acuity was measured at 20/20 or better (corrected or
uncorrected) were selected to participate in the study.

Test  Participant  Procedure 

Each test participant was given a response sheet at the beginning of each trial. The test participant placed
a check mark in one box of a sequence of seven boxes corresponding to a 7-point rating scale (1 = least blending,
4 = moderate blending, 7 = most blending) that indicated the degree that he perceived that the left-sided target
camouflage ensemble pattern blended with the background. Similarly, he rated the right-sided target ensemble
pattern.
MACE Data Acquisition Procedure

The MACE system data were obtained simultaneously with the collection of the test participant data. A
reflectance board was located at the base of the target camouflage ensembles to calibrate the MACE apparatus for
standardized video data input. If necessary, MACE filters were changed to adjust for the variation in lighting
conditions at the time of the data acquisition.

Experimental  Methodology 

Experimental Conditions. The experimental conditions for the study were the type of camouflage
(Woodland, Olive Drab, Desert, Solid Gray, and Black), sun illumination condition (Front lit and Back lit), and
target-background position (Left and Right). The same group of soldier test participants was used under all the
experimental treatment conditions, which resulted in a repeated measures 2 x 5 x 2 design (Direction of
Illumination x Camouflage Pattern x Left-Right Target-background Position). These experimental conditions for
obtaining the soldier camouflage effectiveness judgments and MACE measurements are presented in Table 1.

Table 1

Experimental Conditions for Obtaining the Soldier Camouflage Effectiveness Judgments
and MACE System Measurements

Direction of Illumination    Front-Lit                      Back-Lit
Camouflage Pattern*          W    O    D    G    B          W    O    D    G    B
Position                     R/L  R/L  R/L  R/L  R/L        R/L  R/L  R/L  R/L  R/L

*Camouflage Codes: W = Woodland, O = Olive Drab, D = Desert, G = Gray, B = Black
Position Codes: R = Right, L = Left

Dependent Variable. The dependent variable was the human camouflage effectiveness response obtained
from the soldier participant observer. This measure was a rating, on a 7-point scale (1 = least blending, 4 = moderate
blending, 7 = most blending), of the camouflage effectiveness of each ensemble pattern against the background.

A paired comparison method was used to obtain the human response measure for this study. The paired
comparison method is "a procedure in which objects are compared with each other in pairs, each with each, till all
combinations are given" (English & English, 1958). This method was chosen because it allowed the test
participant to concentrate on the immediate perceptual task and not rely on memory as is required by some other
methods.

The paired comparisons of five test ensembles resulted in ten combinations of pairs. However, to control
for variations in the local background, each pair of camouflage ensembles was presented twice; that is, once with a
member of the paired combination viewed on the left position and once with the member viewed on the right
position (see Figures 1 & 2). Because each pair of ensembles was presented twice (left and right positions), the
number of paired combinations was doubled to 20 presentations. The presentation of these 20 pairs of ensembles
was randomized within each session to control for biased responses that may result from learned expectancies
resulting from systematic patterns of ensemble presentations. The time between left and right presentations was
approximately 100 minutes. The result of this paired comparison procedure allowed the computation of group
average ensemble pattern ratings of camouflage effectiveness from the ten test participants for each trial of the ten
ensemble pair presentations.
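The counting in the paired comparison procedure can be illustrated as follows. This is a minimal sketch of how such a presentation schedule might be generated; the function name and the seed are hypothetical, and the simple shuffle does not reproduce the approximately 100-minute separation between left and right presentations that the study reports.

```python
import itertools
import random

PATTERNS = ["Woodland", "Olive Drab", "Desert", "Gray", "Black"]


def build_presentations(patterns, seed=None):
    """Return the unordered pairs and a shuffled list of left/right presentations.

    Each unordered pair appears twice, once in each left/right arrangement.
    """
    pairs = list(itertools.combinations(patterns, 2))  # 5 choose 2 = 10 pairs
    presentations = []
    for first, second in pairs:
        presentations.append({"left": first, "right": second})
        presentations.append({"left": second, "right": first})
    rng = random.Random(seed)   # seeded only to make the illustration repeatable
    rng.shuffle(presentations)  # randomize presentation order within a session
    return pairs, presentations


pairs, schedule = build_presentations(PATTERNS, seed=1997)
print(len(pairs), "pairs;", len(schedule), "presentations")
for trial in schedule[:3]:
    print(trial)
```

In this arrangement every pattern appears four times in the left position and four times in the right position, so local background differences are balanced across patterns.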
Predictor Variables. The predictor variables are the MACE basic system variables related to light
reflectance, spectral wavelength, and texture. Each of the basic variables resulted in three measures at three levels
of analysis related to lightness (black-white scale (L)) and color, i.e., a red-green scale value (A) and a yellow-blue
scale value (B). A list of the MACE system variables is presented in Table 2. The Bhattacharyya distance measure
is described by Fukunaga (1990). The texture variables are based on the reports by Haralick, Shanmugam, and
Dinstein (1973) and Connors and Harlow (1980).
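As an illustration of the Bhattacharyya distance, the sketch below evaluates its closed form for two univariate normal distributions, one describing target pixel values and one describing background pixel values on a single scale (L, A, or B). The report does not state which form of the measure MACE used, so the normality assumption, the function name, and the example values are all hypothetical.

```python
import math


def bhattacharyya_gaussian(mean_t, var_t, mean_b, var_b):
    """Bhattacharyya distance between two univariate normal distributions.

    Assumes target (t) and background (b) pixel values are each roughly
    normal; this is an illustrative assumption, not the documented MACE method.
    """
    avg_var = 0.5 * (var_t + var_b)
    # Term driven by the separation of the two means.
    mean_term = 0.125 * (mean_t - mean_b) ** 2 / avg_var
    # Term driven by the mismatch of the two variances.
    var_term = 0.5 * math.log(avg_var / math.sqrt(var_t * var_b))
    return mean_term + var_term


# Hypothetical lightness statistics for a target and its background.
print(bhattacharyya_gaussian(52.0, 16.0, 40.0, 25.0))
```

The distance is zero when the two distributions coincide and grows as the target and background differ in mean or spread, which is why it can serve as a single-number measure of target-background separation.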


Table 2

List of Mobile Army Camouflage Evaluation (MACE) System Variables

Var.  Stem                                            LAB (Prefix)
No.   Abbrev.  Name of Basic Variable                 L      A      B

Lightness and Color Measures

 1.   MEA      Mean (pixels)                          LMEA   AMEA   BMEA
 2.   VAR      Standard Deviation                     LVAR   AVAR   BVAR
 3.   SKE      Skewness                               LSKE   ASKE   BSKE
 4.   KUR      Kurtosis                               LKUR   AKUR   BKUR
 5.   MIN      Minimum Value                          LMIN   AMIN   BMIN
 6.   MAX      Maximum Value                          LMAX   AMAX   BMAX
 7.   BAT      Bhattacharyya Distance                 LBAT   ABAT   BBAT

Texture Measures

 8.   A2M      Angular 2nd Moment                     LA2M   AA2M   BA2M
 9.   ENM      Entropy                                LENM   AENM   BENM
10.   CSM      Cluster Shade                          LCSM   ACSM   BCSM
11.   CPM      Cluster Prominence                     LCPM   ACPM   BCPM
12.   LCM      Linear Correlation                     LLCM   ALCM   BLCM
13.   I1M      Info Measure of Correlation I          LI1M   AI1M   BI1M
14.   I2M      Info Measure of Correlation II         LI2M   AI2M   BI2M
15.   SGM      Angular 2nd Moment of SGLDM            LSGM   ASGM   BSGM
16.   ESM      Entropy of SGLDM                       LESM   AESM   BESM
17.   S2M      2nd Moment of SGLDM                    LS2M   AS2M   BS2M
18.   ISM      Inverse 2nd Moment of SGLDM            LISM   AISM   BISM

Note. The abbreviation of a MACE system variable consists of a prefix, a stem, and a suffix. The prefix is the
letter L, A, or B indicating lightness (L), red-green scale (A), or yellow-blue scale (B) variable. The stem is a
three-character abbreviation of the basic MACE system name (see Table 2). The suffix designation, used later in
the report, is a one-character number indicating a quadratic (2) or cubic (3) transformation of the MACE system
variable.
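The naming convention in the note can be expressed compactly. The sketch below builds the variable names from the prefixes, stems, and suffixes listed in Table 2; the stem spellings are as read from the table, and the list and variable names in the code are illustrative only.

```python
PREFIXES = ["L", "A", "B"]  # lightness, red-green scale, yellow-blue scale
STEMS = [
    "MEA", "VAR", "SKE", "KUR", "MIN", "MAX", "BAT",   # lightness and color
    "A2M", "ENM", "CSM", "CPM", "LCM", "I1M", "I2M",   # texture
    "SGM", "ESM", "S2M", "ISM",
]
SUFFIXES = ["", "2", "3"]   # untransformed, quadratic, cubic

# 18 basic measures x 3 scales = 54 MACE system variables.
base_names = [prefix + stem for stem in STEMS for prefix in PREFIXES]

# Each variable with its quadratic and cubic transformation.
candidate_names = [name + suffix for name in base_names for suffix in SUFFIXES]

print(len(base_names), len(candidate_names))
print(candidate_names[:6])
```

The two counts correspond to the 54 MACE system variables and to the 162 candidate predictor variables considered later in the regression analyses.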
Data Analyses

Ensemble Camouflage Effectiveness Ratings. The 10 test observers viewed each of the camouflage ensemble
pairs 20 times during the course of the 10 sessions. Each observer rated the camouflage ensemble, on a 7-point
scale, for the degree that the ensemble blended with the background. A "1" on the scale indicated low camouflage
effectiveness, a "4" indicated moderate camouflage effectiveness, and a "7" indicated high camouflage
effectiveness.

MACE Predictor Weights and Ensemble Camouflage Effectiveness Ratings. The MACE system variables were
considered as independent or "predictor" variables and the soldier ensemble effectiveness rating was considered as
the dependent or "outcome" variable. The variables that contributed most to the explanatory variance were studied
to try to understand the underlying psychophysical processes involved. This subset of variables was then
considered as a smaller set of predictor variables to simplify the complexity of the multivariate interrelationships.
The determination of the relative weighting coefficients for the MACE system factors was accomplished by
multiple regression analyses. It is acknowledged that the camouflage effectiveness ratings are ordinal level
measurements as opposed to the more desired interval level measurements for multiple regression analyses.
However, at this early stage of machine perception research, the multiple regression method was felt to be
sufficiently robust to allow the identification of the relative importance of the MACE system variables.

In an attempt to validate the multiple regression prediction equations within the parameters of the study, a
portion of the data was randomly deleted and the remaining data used to develop the equations. The Pearson
correlation coefficient between the observed and predicted response was obtained to ascertain the reliability of the
weighting coefficients.

Experimental Treatments. The mean ensemble 7-point camouflage effectiveness ratings for the individual
ensemble patterns were computed for the front and back lit illumination conditions and the left and right
target-background positions. The statistical significance of differences between the means was determined by analyses of
variance. If there were no significant differences, then the predictor weights would be assumed to be valid for the
experimental treatment conditions used. However, if there were significant differences, then further analyses and
studies focusing on the specific treatment conditions would be indicated.


RESULTS 

Effect  of  Camouflage  Pattern,  Light  Direction,  and  Target-Background  Position 

The raw data for the 54 MACE system variables and soldier observer ratings of camouflage effectiveness
were obtained from the total of 345 trials. The mean ratings for each camouflage pattern are presented,
graphically, in Figure 4. The resulting rank order for the five camouflage patterns against a woodland background
was (1) woodland, (2) olive drab, (3) black, (4) gray, and (5) desert. The mean ratings for each camouflage pattern
were further analyzed by the effect of light direction (front and back) and target-background position (left and
right). These data are presented graphically in Figure 5. It can be seen that there is virtually no effect of light
direction or target-background position on the soldier response ratings.
Figure 4. Soldier rating of camouflage effectiveness by camouflage pattern (1 = least blending, 4 = moderate
blending, 7 = most blending).


Figure 5. The effect of the position of the target (left or right side) by direction of lighting (front lit vs.
back lit) and by camouflage ensemble pattern on the mean camouflage effectiveness rating.
Relative Importance of the MACE System Variables

In order to determine the relative importance of the MACE system variables in predicting the soldier
rating responses, the stepwise technique of the multiple regression analysis method was used (SPSS Release 4.0 for
Sun 4). Prior examination of the raw data indicated that some of the relationships between the MACE system
variables and the human response variable (rating) were nonlinear; therefore, quadratic and cubic transformations
of the raw data were also used in the subsequent multiple regression analyses. Thus, the 54 L, A, and B variables
(Table 2) and their quadratic and cubic transformations resulted in a total of 162 MACE system candidate
predictor variables. When the 162 candidate predictor variables were correlated with the soldier ratings of
camouflage effectiveness, 120 of these candidate variables were found to be statistically significant. Stepwise
multiple regression analyses considering the entire pool of 162 candidate variables resulted in an optimum subset
of 16 predictor variables with a multiple R of 0.95 (F(16) = 189.7, p < .01). The beta coefficients show that variables
relating to the overall lightness difference between the target and background contributed most of the explanatory
variance. The next most important variables were those relating to the yellow-blue color scale.

The stepwise progression of the multiple R is presented in Figure 6. The pattern of the growth curve of
the multiple R as variables are selected by the stepwise method shows an initial correlation of .827, a rapid increase
as the next few variables are considered, and finally, a leveling of the curve as the maximum number of
statistically significant variables is achieved (a multiple R of .95 for the 16 specific predictor variables).
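The growth of the multiple R as predictors are added can be illustrated with a simplified forward-selection procedure. The sketch below is not the SPSS stepwise algorithm used in the study (which applies entry and removal criteria based on F tests); it is a greedy stand-in, run on synthetic data rather than the MACE measurements, and all names and numbers in it are hypothetical.

```python
import numpy as np


def multiple_r(X, y):
    """Multiple correlation R of an ordinary least-squares fit with intercept."""
    design = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    fitted = design @ coef
    return float(np.corrcoef(fitted, y)[0, 1])


def forward_select(X, y, max_vars):
    """Greedy forward selection: at each step add the column that raises R most."""
    chosen, history = [], []
    for _ in range(max_vars):
        remaining = [j for j in range(X.shape[1]) if j not in chosen]
        best = max(remaining, key=lambda j: multiple_r(X[:, chosen + [j]], y))
        chosen.append(best)
        history.append(multiple_r(X[:, chosen], y))
    return chosen, history


# Synthetic stand-in data (NOT the MACE measurements or soldier ratings).
rng = np.random.default_rng(0)
raw = rng.normal(size=(345, 4))                    # 4 hypothetical raw variables
candidates = np.hstack([raw, raw**2, raw**3])      # raw, quadratic, cubic terms
rating = 4 + 1.5 * raw[:, 0] - 0.8 * raw[:, 1] ** 2 + rng.normal(scale=0.5, size=345)

chosen, history = forward_select(candidates, rating, max_vars=5)
print("selected columns:", chosen)
print("multiple R by step:", [round(r, 3) for r in history])
```

As in Figure 6, the multiple R in such a procedure rises quickly over the first few selected variables and then levels off once the informative predictors have entered.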

Obtaining  an  Efficient  Prediction  Model 

To minimize the effect of multicollinearity that results from the use of too many predictor variables, a
parsimonious subset consisting of the first 6 variables of the 16 specific predictor variables was chosen as a
candidate model for prediction of human performance from the MACE system. These 6 predictor variables, i.e.,
LMEA2, LMEA3, BMEA, BVAR, BMEA3, and LMAX, produced a multiple R of .93 (F(6) = 332.6, p < .01), which
is almost as high as the multiple R of .95 obtained when the 16 variables are used. The weighting coefficients
for the subset of 6 variables are presented in Table 3.


Figure 6. The effect of the number of predictor variables and type of variable (lightness, color, or texture)
on the size of the multiple correlation.
To test the reliability of this parsimonious six-variable predictor model, the original raw data set of 345
observations was divided into two parts. The first part consisted of 305 observations that formed a prediction data
subset (Group 1) that was used to obtain a new set of predictor weighting coefficients for the six variables. The
second part consisted of the remaining 40 observations (Group 2) that were previously randomly selected from the
345 observations. The new set of weighting coefficients obtained from Group 1 was then applied to the same six
variables of the 40 observations of Group 2. The prediction equations were then solved for the Group 2
observations, which yielded predicted camouflage ensemble ratings for this verification group that were compared
with the actual ratings of this group. The Pearson correlation coefficient obtained from the comparison of the
predicted and actual ratings was an r of 0.90. The correlation and regression between the predicted and actual
soldier camouflage ratings for this parsimonious six-variable predictor model is shown in Figure 7. The close fit
suggests that this parsimonious version is an efficient and useful model for predicting human ratings of camouflage
effectiveness.
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P  red  icted  Rating 

Figure  7.  Plot  shows  the  correlation  between  the  predicted  and  the  actual  soldi^  camouflage  effectiveness  ratings. 
The  model  is  based  on  six  MACE  system  predictor  variables  (LMEA2,  LMEA3,  BMEA,  BVAR, 
BMEA3,LMAX).  Forty  cases  plotted.  Correlation  =  .90  R  Squared  =.81  S.E.  of  Est.=.54 
2-tailed  Sig.=.000  Intercept  (S.E.)=  -.  15  (.35)  Slope  (S.E.)=  1 .07  (.08) 
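
The split-sample reliability check described above can be sketched in a few lines. This is only an illustration, assuming ordinary least squares with NumPy; the data are synthetic stand-ins (the MACE measurements and soldier ratings are not reproduced here), and every name and coefficient in the block is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for 345 observations on six predictors (hypothetical values).
n, k = 345, 6
X = rng.normal(size=(n, k))
assumed_weights = np.array([1.0, -0.5, 0.8, -0.3, 0.2, -0.1])
y = 4.0 + X @ assumed_weights + rng.normal(scale=0.5, size=n)

# Randomly hold out 40 observations (Group 2); fit on the other 305 (Group 1).
idx = rng.permutation(n)
test_idx, train_idx = idx[:40], idx[40:]

def add_intercept(M):
    """Prepend a column of ones so the fit includes a constant term."""
    return np.column_stack([np.ones(len(M)), M])

# Weighting coefficients estimated from Group 1 only.
coef, *_ = np.linalg.lstsq(add_intercept(X[train_idx]), y[train_idx], rcond=None)

# Apply the Group 1 coefficients to Group 2 and compare with the actual values.
pred = add_intercept(X[test_idx]) @ coef
r = np.corrcoef(pred, y[test_idx])[0, 1]
print(f"holdout correlation r = {r:.2f}")
```

A high holdout correlation indicates that the coefficients generalize beyond the observations used to estimate them, which is the sense in which the six-variable model is called reliable.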


TABLE 3

Weighting Coefficients of Six Parsimonious Predictor Variables to Obtain an Efficient Model

Step  MultR  Rsq   Variable    Type  B              Beta        T       Sig T
1     .836   .698  LMEA2       L     -.003120       -3.479084   -19.7   .000
2     .879   .773  LMEA3       L     3.18050E-05    2.422736    13.6    .000
3     .905   .818  BMEA        B     .081652        .717738     14.8    .000
4     .919   .845  BVAR        B     -.027090       -.164391    -5.7    .000
5     .928   .862  BMEA3       B     -2.55107E-05   -.259992    -6.6    .000
6     .933   .870  LMAX        L     -.006124       -.134595    -4.3    .000
                   (Constant)        4.625755                   67.9    .000

Note. MultR = Multiple correlation coefficient; Rsq = the Multiple correlation coefficient squared (explained
variance); Type: (L) lightness scale, (A) red-green scale, (B) yellow-blue scale; B = the raw data weighting
coefficient (slope); Beta = the standardized weighting coefficient; T = slope / standard error of the slope; Sig T =
the statistical significance of T.

The Relative Contribution of Lightness, Color, and Texture Information to the Predictive Models


While the parsimonious six-variable predictor model appeared useful, it was noted that most of the
higher-weighted variables were related to lightness and the yellow-blue color scale whereas the lesser-weighted
variables tended to be measures of texture. Consequently, two additional stepwise multiple regression analyses
were performed to assess the relative contribution of the basic lightness and color variables versus the texture
variables. The results of these analyses are also shown in Figure 6. It can be seen that the stepwise multiple
regressions based on the lightness and color variables alone produce virtually the same predictive information as
when the lightness, color, and texture variables are used together. In contrast, the multiple regressions based
on texture variables alone are markedly below those based exclusively on the lightness and color variables. From
this finding, it was determined that a model based solely upon the lightness and color variables could serve as a
useful model for predicting human ratings of camouflage effectiveness.

To further test the effectiveness of a model based on lightness and color variables without the effect of
texture, a second model was produced. The variables chosen for this model were the first six variables obtained
from the multiple regression analysis that was done considering only lightness and color related variables. These
six variables were LMEA, BMEA, BEAT, BVAR, BBAT2, and ABAT2. The weighting coefficients for this
lightness and color model are shown in Table 4. The multiple R obtained from these six variables was .92. The
reliability of the coefficients obtained from these six variables was an r of .86. Thus, further evidence was found to
suggest that the parsimonious six-variable predictor model, which also contains only lightness and color variables
(see Table 3), is an effective and useful model for predicting human ratings of camouflage effectiveness.


TABLE 4

Weighting Coefficients of Six Predictor Variables to Obtain a Model Based on Only Light and Color Variables
(versus Texture Variables)

Step  MultR  Rsq   Variable    Type  B              Beta        T       Sig T
1     .843   .711  LMEA        L     -.076475       -1.226763   -26.0   .000
2     .891   .794  BMEA        B     .072358        .636039     13.3    .000
3     .909   .827  BEAT        B     -2.475205      -.370344    -5.2    .000
4     .920   .846  BVAR        B     -.030880       -.187390    -6.3    .000
5     .922   .850  BBAT2       B     1.283458       .176110     2.5     .012
6     .924   .854  ABAT2       A     .564369        .080290     2.8     .005
                   (Constant)        5.560999                   72.1    .000

Note. MultR = Multiple correlation coefficient; Rsq = the Multiple correlation coefficient squared (explained
variance); Type: (L) lightness scale, (A) red-green scale, (B) yellow-blue scale; B = the raw data weighting
coefficient (slope); Beta = the standardized weighting coefficient; T = slope / standard error of the slope; Sig T =
the statistical significance of T.

DISCUSSION


The results of the various statistical analyses described herein indicate that, of the L (lightness scale), A
(red-green scale), and B (yellow-blue scale) variables, the most predictive were those involved with the lightness
scale, followed by those concerned with the yellow-blue color scale. The importance of the lightness scale
variables suggests that a target may be camouflaged to a large extent simply by the closeness of its reflected
gray-scale light level pattern to the overall background gray-scale pattern. As for the role of the yellow-blue color
scale variables, given the woodland background used in this study (which exhibited more of the colors in the
yellow-blue scale than the red-green scale), it is reasonable to find that the human rating of high camouflage
effectiveness would be associated with camouflage ensemble color patterns most similar to the colors in the
woodland background.

It was interesting, however, to discover that the texture variables were less highly weighted than were the
variables involved with lightness and color. This finding leads one to consider how the distance factor may play a
role. Generally, the ability of a viewer to discern the details of an object is related to the distance between the
viewer and the object (Gibson, 1950). The viewing distance recommended by the developers of the MACE system
for the apparatus was 20-100 meters (Goodell & Kilian, 1992). However, to provide some camouflage effect, we
felt that 20 meters would be too close; i.e., the target object would look artificial and be too easy for the soldier
raters to discern. On the other hand, at the longer distances, the image of the target pattern on the MACE
computer display would involve so few pixels that the opportunity for the MACE system computer algorithms to
discriminate among the patterns would be minimal. Consequently, the distance of 30 meters was chosen as being
just far enough away to provide some camouflage effect of the target patterns, yet still close enough to provide the
MACE system with sufficient target pattern pixels to allow the computer algorithms to analyze textural differences.
While the 30 meter distance selected appeared optimum, it may still not have allowed the textural variables to
contribute sufficient information to discriminate among the patterns used.

In conclusion, the objectives of this study were to determine whether or not the MACE system variables
could discriminate among a range of camouflage patterns and correlate with human judgments of camouflage
effectiveness. The results of the investigation provided a combination of 16 weighted MACE system variables that
correlated highly with the soldiers' pattern ratings. Further analyses yielded an even smaller number of MACE
system variables (the six-variable parsimonious predictor model), which involved only lightness and color
variables, and provided virtually the same predictive information as the 16 variable model. It was also suggested
that the MACE system could be further enhanced to exploit camouflage texture information at close ranges, which
might further increase the system's performance in predicting human perception of camouflage effectiveness and
perhaps lead to the development of improved camouflage textural patterns. These possibilities deserve further
investigation. Overall, based on the results of this study, the MACE system concept, as a cost-effective means of
automating the process of evaluating camouflage effectiveness, appears promising.

ABSTRACT


The analysis of variance procedure produces no estimate of interaction when
there is insufficient repetition within cells to estimate error. An example of
this is the two-way layout with one observation per cell. Johnson and Graybill
presented a method for that situation which allows the identification of those
cells which may contain interaction. The identified cells are excluded from a
subsequent analysis of variance.

In this paper, an alternate method is presented for the identification of those
cells with interaction in a two-way layout with one observation per cell,
which also produces an estimate of the interaction or overall deviation from
additivity. The method produces n (= at) "t-statistics", one for each cell, for
determining the significance of the estimates. Although not necessary, it is
prudent to assume that the number of cells which contain interaction is small.

INTRODUCTION. 


The two-way analysis of variance for data which may contain interaction has
produced a small body of interesting literature. Tukey was the first to
propose a method for determining whether interaction exists in this situation.
Tukey's method involves the insertion of a single parameter into the model
which requires only one degree of freedom for testing for no interaction. Let

$$
y_{ij} = \mu + \alpha_i + \tau_j + \lambda \alpha_i \tau_j + \varepsilon_{ij},
\qquad i = 1, 2, \ldots, a; \quad j = 1, 2, \ldots, t,
$$

where the $\varepsilon_{ij}$ are NID$(0, \sigma^2)$ and
$\sum_{i=1}^{a} \alpha_i = \sum_{j=1}^{t} \tau_j = 0$.

In this model, $\lambda$ is a "global" interaction coefficient which does assume a
certain structure. Several variants of this model have been proposed and
studied, while other authors have studied the problem of identification of the
specific cells which contain interaction. A review of this literature may be
found in Johnson and Graybill.
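
As a rough illustration of the one-degree-of-freedom test, the sketch below computes the standard sum of squares for nonadditivity from the estimated row and column effects and forms the F ratio against the remaining residual. The function name and the example table are hypothetical and are not taken from the paper.

```python
import numpy as np

def tukey_one_df(y):
    """Tukey's one-degree-of-freedom test for nonadditivity in an a x t table
    with one observation per cell. Returns (SS_nonadd, SS_resid, F, error df)."""
    y = np.asarray(y, dtype=float)
    a, t = y.shape
    grand = y.mean()
    alpha = y.mean(axis=1) - grand          # estimated row effects
    tau = y.mean(axis=0) - grand            # estimated column effects
    resid = y - grand - alpha[:, None] - tau[None, :]
    ss_resid = float((resid ** 2).sum())    # additive-model residual SS
    # Sum of squares attributable to the lambda * alpha_i * tau_j term.
    ss_nonadd = float((alpha @ y @ tau) ** 2
                      / ((alpha ** 2).sum() * (tau ** 2).sum()))
    df_err = (a - 1) * (t - 1) - 1
    f_stat = ss_nonadd / ((ss_resid - ss_nonadd) / df_err)
    return ss_nonadd, ss_resid, f_stat, df_err

# Hypothetical 3 x 4 table whose spread grows with the row and column level.
example = [[10, 12, 15, 19],
           [11, 14, 18, 25],
           [13, 17, 24, 36]]
ss_nonadd, ss_resid, f_stat, df_err = tukey_one_df(example)
print(ss_nonadd, ss_resid, f_stat, df_err)
```

The F ratio is referred to an F distribution with 1 and (a-1)(t-1)-1 degrees of freedom; a large value signals interaction of the product form assumed by the model, but says nothing about which cells are responsible.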

Johnson and Graybill [2,3] produced a satisfactory solution to this problem. After
applying Tukey's test to determine whether or not interaction does exist,
they estimate the interaction in all possible 2x2 contrasts, and declare a cell to
contain interaction if it is contained in a large number of those contrasts
which produce a significant statistic. This is a significant step, but the total
number of 2x2 contrasts is $\binom{a}{2}\binom{t}{2}$. This quantity is always larger than
n = at for a > 3 and t > 3, and can be much larger than n for even moderately
large a and t. Since each of the n cells is contained in (a - 1)(t - 1) distinct 2x2
contrasts, judgement is required to determine if a particular cell should be
declared "significant" or not. The simultaneous inference problem is avoided
by the examination of all of the $\binom{a}{2}\binom{t}{2}$ statistics. Furthermore, no
follow-up analysis is required since all of the information is provided in one
application of their method. Their method does not produce an estimate of the
size of the interaction. This is not a strong criticism, but some sort of estimate
of the non-additivity would be useful.

* The research for this paper was partially completed while the author was at the
University of South Carolina, Columbia, South Carolina.

Approved for public release; Distribution is unlimited.
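
To give a sense of how quickly the number of 2x2 contrasts outgrows the number of cells, the following small sketch tabulates both counts along with the number of contrasts that contain any one cell. The function name is illustrative only.

```python
from math import comb

def contrast_counts(a, t):
    """Return (number of cells, number of 2x2 contrasts, contrasts per cell)."""
    n_cells = a * t
    n_contrasts = comb(a, 2) * comb(t, 2)   # choose 2 rows and 2 columns
    per_cell = (a - 1) * (t - 1)            # pair the cell's row/column with any other
    return n_cells, n_contrasts, per_cell

print(contrast_counts(3, 5))   # (15, 30, 8)
print(contrast_counts(6, 8))   # (48, 420, 35)
```

For a 3x5 layout there are already twice as many contrasts as cells, and each cell appears in 8 of them.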

In  this  paper,  a  method  is  proposed  which  makes  no  assumptions  about  the 
structure  of  the  interaction.  It  produces  individual  estimates  and  associated 
"t-statistics"  for  interaction  in  each  of  the  n  =  at  cells  of  the  two  way  layout. 
The  issue  of  simultaneous  inference  is  not  completely  eliminated  since  some 
follow-up  analysis  may  be  desired  if  the  total  number  of  cells  containing 
interaction  is  large.  That  is,  the  t-statistic  in  a  given  cell  is  computed  with  the 
error  estimate  based  on  the  observations  in  all  of  the  other  cells.  If  several 
cells  are  discovered  to  contain  interaction,  it  might  be  desirable  to  conduct  a 
final  test  based  on  the  removal  of  all  of  these  cells. 

THE  GENERAL  CASE 


A general solution to the problem is presented here, after which several
special cases will be examined. The general notation is that of Graybill. To
motivate additional notation, notice the following design matrix. This matrix
represents a 3x3 array with one observation per treatment combination in
which cells (1,1) and (1,2) include interaction. That is, two observations
involve interaction. Notice the 2x2 identity matrix in the upper right hand
corner of this matrix, and notice the 5x2 matrix of 0's just below it. The model
which would generate this design matrix is:

$$
y_{ij} =
\begin{cases}
\mu + \alpha_i + \tau_j + \gamma_{ij} + \varepsilon_{ij}, & (i,j) = (1,1) \text{ or } (1,2), \\
\mu + \alpha_i + \tau_j + \varepsilon_{ij}, & \text{otherwise.}
\end{cases}
$$

$$
\begin{array}{ccccccccc}
\mu & \alpha_1 & \alpha_2 & \alpha_3 & \tau_1 & \tau_2 & \tau_3 & \gamma_{11} & \gamma_{12} \\
\hline
1 & 1 & 0 & 0 & 1 & 0 & 0 & 1 & 0 \\
1 & 1 & 0 & 0 & 0 & 1 & 0 & 0 & 1 \\
1 & 1 & 0 & 0 & 0 & 0 & 1 & 0 & 0 \\
1 & 0 & 1 & 0 & 1 & 0 & 0 & 0 & 0 \\
1 & 0 & 1 & 0 & 0 & 0 & 1 & 0 & 0 \\
1 & 0 & 0 & 1 & 1 & 0 & 0 & 0 & 0 \\
1 & 0 & 0 & 1 & 0 & 0 & 1 & 0 & 0
\end{array}
$$


More generally, let $y_{ij}$, i = 1, 2, ..., a and j = 1, 2, ..., t represent n = at
observations of which q involve interaction. (Later, restrictions on the sizes
of a, t, and of q relative to a and t which will allow testing will be apparent.)
Let $\Gamma$ represent the set of the cells in which interaction occurs. Then

$$
y_{ij} =
\begin{cases}
\mu + \alpha_i + \tau_j + \gamma_{ij} + \varepsilon_{ij}, & (i,j) \in \Gamma, \\
\mu + \alpha_i + \tau_j + \varepsilon_{ij}, & (i,j) \notin \Gamma.
\end{cases}
$$

It will be convenient to represent these models in matrix form as follows. Let
y represent the vector of observations and let
$y = \begin{bmatrix} y_\gamma \\ y_0 \end{bmatrix}$, where $y_0$ contains
those observations in y which involve no interaction, and $y_\gamma$ contains those
observations in y which involve interaction. Let $\beta$ represent the p×1
vector with components $(\mu, \alpha_1, \alpha_2, \ldots, \alpha_a, \tau_1, \tau_2, \ldots, \tau_t)$,
where p = (a + t + 1), and let $\gamma$ represent the vector of all interaction
parameters in the model, where q is the total number of these parameters. Let
$X_\gamma$ be the q×p matrix which corresponds to those observations which involve
the interaction parameter, $\gamma$, and let $X_0$ be the (n-q)×p matrix corresponding
to those observations which do not involve the interaction parameter. Then the
design matrix, in partitioned form, appears as follows (where 0 is an (at-q)×q
matrix of 0's and $I_q$ is the q×q identity matrix):

$$
X = \begin{bmatrix} X_\gamma & I_q \\ X_0 & 0 \end{bmatrix}.
$$

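The model may be represented as

$$
\begin{bmatrix} y_\gamma \\ y_0 \end{bmatrix}
= \begin{bmatrix} X_\gamma & I_q \\ X_0 & 0 \end{bmatrix}
\begin{bmatrix} \beta \\ \gamma \end{bmatrix} + \varepsilon,
$$

where $\varepsilon$ represents the vector of errors. A formal manipulation results in the
following:

$$
X'X = \begin{bmatrix} X_\gamma' X_\gamma + X_0' X_0 & X_\gamma' \\ X_\gamma & I_q \end{bmatrix},
$$

and the normal equations are:

$$
\begin{bmatrix} X_\gamma' X_\gamma + X_0' X_0 & X_\gamma' \\ X_\gamma & I_q \end{bmatrix}
\begin{bmatrix} \hat\beta \\ \hat\gamma \end{bmatrix}
= \begin{bmatrix} X_\gamma' & X_0' \\ I_q & 0 \end{bmatrix}
\begin{bmatrix} y_\gamma \\ y_0 \end{bmatrix}.
$$

The conditional inverse or c-inverse of X'X, denoted by $(X'X)^c$, is:

$$
(X'X)^c = \begin{bmatrix}
(X_0' X_0)^c & -(X_0' X_0)^c X_\gamma' \\
-X_\gamma (X_0' X_0)^c & I_q + X_\gamma (X_0' X_0)^c X_\gamma'
\end{bmatrix}.
$$

Finally, $(X'X)^c X'$ is:

$$
(X'X)^c X' = \begin{bmatrix}
0 & (X_0' X_0)^c X_0' \\
I_q & -X_\gamma (X_0' X_0)^c X_0'
\end{bmatrix}.
$$

Then, the least squares estimators of $\beta$ and $\gamma$ are

$$
\hat\beta = (X_0' X_0)^c X_0' y_0
$$

and

$$
\hat\gamma = y_\gamma - X_\gamma (X_0' X_0)^c X_0' y_0 = y_\gamma - X_\gamma \hat\beta.
$$

Thus the algorithm is a 2-stage operation consisting of obtaining estimators of
$\beta$ using only those observations which contain no interaction, and then inserting
these into the formula for obtaining the estimate for $\gamma$.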
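
The two-stage operation can be sketched numerically. The block below is a minimal illustration, assuming NumPy's Moore-Penrose pseudoinverse as one admissible choice of conditional inverse; the function names and the small artificial table (exactly additive, with 5 added to one cell) are hypothetical.

```python
import numpy as np

def design_row(i, j, a, t):
    """Row of the additive design: columns are mu, alpha_1..alpha_a, tau_1..tau_t."""
    x = np.zeros(1 + a + t)
    x[0] = 1.0
    x[1 + i] = 1.0
    x[1 + a + j] = 1.0
    return x

def two_stage(y, cells):
    """Stage 1: fit the additive model to the cells NOT flagged for interaction.
    Stage 2: interaction estimate = observed minus additive prediction."""
    y = np.asarray(y, dtype=float)
    a, t = y.shape
    flagged = set(cells)
    clean = [(i, j) for i in range(a) for j in range(t) if (i, j) not in flagged]
    X0 = np.array([design_row(i, j, a, t) for i, j in clean])
    y0 = np.array([y[i, j] for i, j in clean])
    Xg = np.array([design_row(i, j, a, t) for i, j in cells])
    yg = np.array([y[i, j] for i, j in cells])
    G = np.linalg.pinv(X0.T @ X0)      # pseudoinverse used as the c-inverse (assumption)
    beta = G @ X0.T @ y0               # stage 1: beta-hat from clean cells only
    gamma = yg - Xg @ beta             # stage 2: gamma-hat
    sse = float(np.sum((y0 - X0 @ beta) ** 2))
    df = (a - 1) * (t - 1) - len(cells)
    return gamma, beta, sse, df

# Artificial 3 x 3 table: exactly additive, then 5 added to cell (0, 0).
table = np.array([[10.0 + 2 * i + 3 * j for j in range(3)] for i in range(3)])
table[0, 0] += 5.0
gamma, beta, sse, df = two_stage(table, [(0, 0)])
print(gamma, sse, df)   # gamma is 5 up to rounding; residual SS is essentially zero
```

Although the individual components of beta-hat depend on which conditional inverse is used, the predictions for the flagged cells, and hence gamma-hat, do not, provided the remaining cells still connect every row and column.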

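The estimate of the error variance is obtained in the usual way. Let $\theta$
represent the vector $\begin{bmatrix} \beta \\ \gamma \end{bmatrix}$ and $\hat\theta$ represent
the corresponding vector of estimates. Then, substituting $\hat\gamma = y_\gamma - X_\gamma \hat\beta$
in the r.h.s. of the first equality immediately following, the result is:

$$
\begin{aligned}
(y - X\hat\theta)'(y - X\hat\theta)
&= \left( \begin{bmatrix} y_\gamma \\ y_0 \end{bmatrix}
 - \begin{bmatrix} X_\gamma & I_q \\ X_0 & 0 \end{bmatrix}
   \begin{bmatrix} \hat\beta \\ \hat\gamma \end{bmatrix} \right)'
   \left( \begin{bmatrix} y_\gamma \\ y_0 \end{bmatrix}
 - \begin{bmatrix} X_\gamma & I_q \\ X_0 & 0 \end{bmatrix}
   \begin{bmatrix} \hat\beta \\ \hat\gamma \end{bmatrix} \right) \\
&= \begin{bmatrix} 0 \\ y_0 - X_0 \hat\beta \end{bmatrix}'
   \begin{bmatrix} 0 \\ y_0 - X_0 \hat\beta \end{bmatrix} \\
&= (y_0 - X_0 \hat\beta)'(y_0 - X_0 \hat\beta).
\end{aligned}
$$

The error variance is, therefore, estimated only with those observations
which are not affected by interaction. Note that
$(I - X_0 (X_0' X_0)^c X_0')$ is symmetric and idempotent. Since the rank of $X_0$ is q
less than the rank of X, then

$$
\hat\sigma^2 = \frac{y_0' \left( I_{n-q} - X_0 (X_0' X_0)^c X_0' \right) y_0}{(a-1)(t-1) - q}.
$$

Assuming that the errors are NID$(0, \sigma^2 I)$, then $[(a-1)(t-1)-q]\,\hat\sigma^2 / \sigma^2$ is
distributed as a chi-square random variable with (a-1)(t-1)-q degrees of freedom, and
$\hat\sigma^2$ is the UMVUE of $\sigma^2$. An examination of the degrees of freedom indicates
that in a 2x2 layout, it would be impossible to estimate an interaction coefficient. In a
3-level layout, there are enough degrees of freedom to simultaneously estimate the
interaction coefficients associated with no more than 3 cells. In general, this
"saturation" limit is (a-1)(t-1)-1.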

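It is important to point out that the linear forms
$\hat\gamma = y_\gamma - X_\gamma (X_0' X_0)^c X_0' y_0$ are independent of the quadratic form
$y_0' (I_{n-q} - X_0 (X_0' X_0)^c X_0') y_0$. To see this, write both forms as functions of y.
If $\hat\gamma$ is written as By, then

$$
B = \begin{bmatrix}
I_q & -X_\gamma (X_0' X_0)^c X_0' \\
0_{(n-q) \times q} & 0_{(n-q) \times (n-q)}
\end{bmatrix}.
$$

If the quadratic form for $\hat\sigma^2$ is represented as $[(a-1)(t-1)-q]^{-1}\, y'Ay$, then

$$
A = \begin{bmatrix}
0_{q \times q} & 0_{q \times (n-q)} \\
0_{(n-q) \times q} & I_{n-q} - X_0 (X_0' X_0)^c X_0'
\end{bmatrix}.
$$

It is easily seen that A and B are conformable and that BA = 0. Thus, from a
well known theorem on the independence of linear and quadratic forms (see
Theorem 4.5.1, p. 137, in Graybill) the linear forms representing $\hat\gamma$ are
independent of the quadratic form $\hat\sigma^2$. The result is that, given normal and
independent errors, the components of $\hat\gamma$ are normally distributed and
independent of $\hat\sigma^2$, and that t-statistics can be constructed to test for the
presence of interaction. Before doing that, the moments of $\hat\gamma$ are required.

It is easily seen that $\hat\gamma$ is unbiased for $\gamma$ by substituting (2.6) into (2.5) and
using (2.2). Then, Cov$[\hat\gamma]$ = E$[(\hat\gamma - \gamma)(\hat\gamma - \gamma)']$ = Cov[By], and

$$
\mathrm{Cov}[By] = \sigma^2 BB' = \sigma^2 \begin{bmatrix}
I_q + X_\gamma (X_0' X_0)^c X_\gamma' & 0_{q \times (n-q)} \\
0_{(n-q) \times q} & 0_{(n-q) \times (n-q)}
\end{bmatrix}.
$$

Thus,

$$
\mathrm{Cov}[\hat\gamma] = \sigma^2 \left( I_q + X_\gamma (X_0' X_0)^c X_\gamma' \right).
$$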
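
Both results can be checked numerically for a small layout. The sketch below builds B and A for a 3x5 layout with a single flagged cell, confirms that BA vanishes to rounding error, and compares the covariance factor with the closed form at/((a-1)(t-1)) that applies when one cell is flagged. The layout size, the flagged cell, and the use of the pseudoinverse as the conditional inverse are illustrative assumptions.

```python
import numpy as np

a, t = 3, 5                      # layout size (illustrative choice)
n, q = a * t, 1                  # n cells, one of them flagged

def design_row(i, j):
    """Row of the additive design: mu, alpha_1..alpha_a, tau_1..tau_t."""
    x = np.zeros(1 + a + t)
    x[0] = x[1 + i] = x[1 + a + j] = 1.0
    return x

flagged = (0, 0)
others = [(i, j) for i in range(a) for j in range(t) if (i, j) != flagged]
Xg = design_row(*flagged)[None, :]                     # q x p
X0 = np.array([design_row(i, j) for i, j in others])   # (n-q) x p

G = np.linalg.pinv(X0.T @ X0)    # pseudoinverse as the c-inverse (assumption)
P = G @ X0.T                     # maps y_0 to beta-hat

# B picks out gamma-hat from y = (y_gamma, y_0); A carries the residual projector.
B = np.zeros((n, n))
B[:q, :q] = np.eye(q)
B[:q, q:] = -Xg @ P
A = np.zeros((n, n))
A[q:, q:] = np.eye(n - q) - X0 @ P

max_abs_BA = float(np.abs(B @ A).max())
var_factor = float((np.eye(q) + Xg @ G @ Xg.T)[0, 0])
closed_form = a * t / ((a - 1) * (t - 1))
print(max_abs_BA, var_factor, closed_form)
```

The product BA is zero because the residual projector annihilates the column space of X_0, which is exactly the property that makes the interaction estimates independent of the error estimate.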

TWO SPECIAL CASES


The  algorithm  is  based  on  the  examination  of  each  cell  individually  for  the 
presence  of  interaction.  The  cells  which  are  believed  to  contain  interaction 
are  simply  treated  as  missing  cells  in  the  following  ANOVA  procedure.  The 
formal  estimates  are  given  below. 

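Suppose that the observations $y_{ij}$ arise from the two-way classification model
with one observation per cell, where i assumes the values 1, 2, ..., a and j assumes the
values 1, 2, ..., t, and where exactly one of the n = at cells contains
interaction. Designate that cell as cell (K,L), where 1 ≤ K ≤ a and 1 ≤ L ≤ t.

The previous results produce normal equations and the parameter estimates.
The customary dot notation is used: $\bar y_{i\cdot} = \frac{1}{t}\sum_{j} y_{ij}$ and
$\bar y_{\cdot j} = \frac{1}{a}\sum_{i} y_{ij}$. The superscript 0 refers to summation
within the row or column which contains the cell with interaction:

$$
\bar y^0_{K\cdot} = \frac{1}{t-1} \sum_{j=1,\, j \ne L}^{t} y_{Kj}, \qquad
\bar y^0_{\cdot L} = \frac{1}{a-1} \sum_{i=1,\, i \ne K}^{a} y_{iL},
$$

while $\bar y^0$ refers to the overall average except for the (K,L)-th cell:
$\bar y^0 = \frac{1}{at-1} \sum\sum_{(i,j) \ne (K,L)} y_{ij}$. The estimators are based on the
conditional inverse using the usual constraints, $\sum_{i=1}^{a} \alpha_i = \sum_{j=1}^{t} \tau_j = 0$:

$$
\begin{aligned}
\hat\mu &= \bar y^0 + \frac{1}{t(a-1)} (\bar y^0_{K\cdot} - \bar y^0)
 + \frac{1}{a(t-1)} (\bar y^0_{\cdot L} - \bar y^0), \\
\hat\alpha_i &= (\bar y_{i\cdot} - \bar y^0) - \frac{1}{t(a-1)} (\bar y^0_{K\cdot} - \bar y^0)
 - \frac{1}{a(t-1)} (\bar y^0_{\cdot L} - \bar y^0), \qquad i \ne K, \\
\hat\tau_j &= (\bar y_{\cdot j} - \bar y^0) - \frac{1}{t(a-1)} (\bar y^0_{K\cdot} - \bar y^0)
 - \frac{1}{a(t-1)} (\bar y^0_{\cdot L} - \bar y^0), \qquad j \ne L, \\
\hat\gamma_{KL} &= (y_{KL} - \bar y^0) - \frac{a}{a-1} (\bar y^0_{K\cdot} - \bar y^0)
 - \frac{t}{t-1} (\bar y^0_{\cdot L} - \bar y^0).
\end{aligned}
$$

The remaining estimates $\hat\alpha_K$ and $\hat\tau_L$ follow from the constraints.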


$$
\hat\sigma^2 = \frac{1}{(a-1)(t-1)-1} \sum\sum_{(i,j) \ne (K,L)}
 \left( y_{ij} - \hat\mu - \hat\alpha_i - \hat\tau_j \right)^2.
$$

The moments of $\hat\gamma_{KL}$ are: E$[\hat\gamma_{KL}] = \gamma_{KL}$ and
Var$[\hat\gamma_{KL}] = \frac{at}{(a-1)(t-1)} \sigma^2$.

From the results of part 2, $\hat\gamma_{KL}$ and $\hat\sigma^2$ are independent, so that for a
non-zero value of $\gamma_{KL}$ the statistic t below is a noncentral Student's t with
(a-1)(t-1)-1 degrees of freedom. Under the null hypothesis, $\gamma_{KL}$ is zero, and the
test is two-sided.

$$
t = \frac{\hat\gamma_{KL}}{\hat\sigma \sqrt{\dfrac{at}{(a-1)(t-1)}}}
$$

In the following example, it will be of interest to simultaneously test two cells
for significance. Suppose that there are two cells which contain interaction.
The estimates differ formally, depending upon whether the two cells are
either in the same row or column or in distinct rows and distinct columns.

Suppose the cells are in the same column. Call the cells (K,M) and (L,M). The
solution to the normal equations produces estimates of $\mu$, $\alpha_K$, $\alpha_L$, $\tau_M$, the
remaining $\alpha_i$ and $\tau_j$, the interactions $\gamma_{KM}$ and $\gamma_{LM}$, and $\sigma^2$:

[Equations: closed-form estimates $\hat\mu$, $\hat\alpha_K$, $\hat\alpha_L$, $\hat\tau_M$, $\hat\alpha_i$,
$\hat\tau_j$, $\hat\gamma_{KM}$, $\hat\gamma_{LM}$, and $\hat\sigma^2$.]

If the two cells which contain interaction are in the same row, then the
estimates can easily be obtained from the estimates given above by an
interchange of indices a and t. The parameter estimates when the two cells
occupy distinct rows and columns are much more complicated to write out.

The solutions to the two previous cases and all others can be obtained by
proceeding as follows. The normal equations are reduced to full rank by first
substituting the constraints into the other equations. Then, the normal equations are
arranged so that the coefficients of the cells which contain interaction are
listed first beginning on the left, followed by the other coefficients. This
produces a "reduced" X'X equation of the following form:

$$
R = \begin{bmatrix}
A_{q \times q} & 0_{q \times (a+t-q)} \\
B_{(a+t-q) \times q} & I_{(a+t-q) \times (a+t-q)}
\end{bmatrix},
$$

where A contains only the coefficients of the $\alpha_i$ and the $\tau_j$ corresponding to
the cells which contain interaction, I contains the coefficients of all other
$\alpha_i$ and $\tau_j$, and B is a rank 1 matrix each row of which contains the additive
coefficients of $\mu$. Then,

$$
R^{-1} = \begin{bmatrix} A^{-1} & 0 \\ -BA^{-1} & I \end{bmatrix}.
$$

The $\alpha_i$ and $\tau_j$ which involve interaction can be solved as a set and then
substituted into the equations for the other $\alpha_i$ and $\tau_j$. This saves a lot of labor.

AN  EXAMPLE 

Johnson and Graybill produced an analysis of data in a 3x5 layout with one
observation per cell which originally appeared in Black. In their analysis,
first Tukey's test is used to determine if, in fact, interaction is present. If it is
found to be present, then $k = \binom{a}{2}\binom{t}{2}$ tests of significance among all
possible 2x2 contrasts of the cells in the layout are conducted. Apparently
then, those cells which are contained in a large number of significant
contrasts are declared to contain interaction. Those cells are treated as
missing observations in a "conventional" ANOVA. Since each cell is contained
in (a-1)(t-1) of the contrasts, some of the cells which do not contain
interaction will appear in some of the "significant" contrasts.
Judgement is involved in deciding which cells are the significant ones. The
data in their example is given in table 1 below. In their analysis, cell (1,1)
was determined to be significant since all of the 8 contrasts which
contained that cell produced significant statistics. The contrast for cell (2,1)
and cell (3,3) also produced one significant statistic, but that was
determined to be due to statistical variation. The mean squared error when
cell (1,1) was removed from the analysis was 6,228.34.

TABLE 1. YIELD IN kg/ha OF SPRING WHEAT

                              Phosphorus in kg/ha
Nitrogen in kg/ha        0       22       45       90      180     Total
0                     1984     2550     2706     2740     2954     12934
45                    1776     2843     3306     3305     3386     14616
90                    1797     2761     3240     3227     3332     14357
Total                 5557     8154     9252     9272     9672     41907


In our analysis, a t-test was conducted on each cell to determine whether
interaction effects are present. No preliminary test for nonadditivity was
conducted. The results are given in table 2. The t-statistics were based on
7 degrees of freedom.

TABLE 2. ANALYSIS OF SPRING WHEAT DATA

Cell     Interaction Estimate    t-statistic    Mean Squared Error
(1,1)          634.999              5.875*            6228.344
(1,2)           73.125              0.279            36542.84
(1,3)         -320.625             -1.372            29117.84
(1,4)         -269.375             -1.109            31421.64
(1,5)         -118.125             -0.455            35887.12
(2,1)         -385.750             -1.760            25612.87
(2,2)           -8.250             -0.031            36945.07
(2,3)          173.625              0.681            34653.43
(2,4)          159.250              0.621            35018.02
(2,5)           61.125              0.233            36665.57
(3,1)         -249.250             -1.014            32216.87
(3,2)          -64.875             -0.248            36629.58
(3,3)          146.999              0.571            35303.84
(3,4)          110.125              0.424            36026.25
(3,5)           57.000              0.217            36702.71

The test indicated that only the (1,1) cell contains interaction, which agrees
with the original analysis [3]. The statistic 5.875 for cell (1,1) corresponds to a
p-value of about 0.0003 with 7 degrees of freedom.


It may be argued that other cells which produce a large (even if not highly
significant) statistic should be checked for significance with cell (1,1)
removed, since the error estimate would be reduced, resulting in a higher
value of the statistic for that cell. In this example, cell (2,1) produced the next
highest t-statistic (-1.76) with error estimate 25,612.87. The error estimate is
lower than that produced by the other cells, which were in the range 30,000 to
36,000. Then, if cell (1,1) is removed from the analysis, a reduction in the
error estimate could produce a significant t-statistic. There is a compensating
effect, namely that the removal of the observation in cell (1,1) would reduce
the overall average in the first column, which would tend to reduce the
interaction estimate in cell (2,1).

As an illustration, the special case estimate given previously is used to
produce "revised" estimates for cells (1,1) and (2,1), which are presented in
table 3. This is not the recommended procedure because the correlation
between cells (1,1) and (2,1) is neglected. A better test is described later.

TABLE 3. REVISED ESTIMATES OF CELLS (1,1) AND (2,1)

Cell     Interaction Estimate    t-statistic    Mean Squared Error
(1,1)          589.5                4.536*            6755.66
(2,1)          -91.00               -.700             6755.66

This  analysis  confirms  the  original  conclusions.  The  estimate  for  cell  (1,1)  is 
highly  significant  and  cell  (2,1)  is  not  significant. 


As stated previously, a better procedure is to conduct a test which takes full
advantage of the correlation between cells (1,1) and (2,1). Based on the results
of Section 3, the estimates $\hat\gamma_{11}$ and $\hat\gamma_{21}$, assuming
normal and independent errors, are jointly normally distributed with means $\gamma_{11}$
and $\gamma_{21}$, variances $\frac{t(a-1)}{(t-1)(a-2)} \sigma^2$, and covariance
$\frac{t}{(t-1)(a-2)} \sigma^2$. Since cell (1,1) has already been judged to contain
interaction, the recommended procedure is to test the hypothesis that $\gamma_{21}$ is zero,
given that $\gamma_{11} = 635$ and $\hat\gamma_{11} = 589.5$. The appropriate distribution is the
distribution of $\hat\gamma_{21}$ given $\hat\gamma_{11}$ and $\gamma_{11}$. The conditional
distribution of $\hat\gamma_{21} \mid (\hat\gamma_{11}, \gamma_{11})$ is:

$$
N\!\left( \gamma_{21} + \frac{1}{a-1} (\hat\gamma_{11} - \gamma_{11}), \;
 \frac{at}{(a-1)(t-1)} \sigma^2 \right).
$$

The appropriate values are: $\hat\gamma_{21} = -91$; $\gamma_{11} = 635$; $\hat\gamma_{11} = 589.5$;
and $\hat\sigma^2 = 6{,}755$. These produce the t-value -1.213. The appropriate degrees of
freedom are 6. This result agrees with the previous conclusion.

EXPLORATION OF SATELLITE IMAGES IN THE DYNAMICALLY LINKED

J. Symanzik*, D. Cook, S. Klinke, and N. Lewin

ABSTRACT

The  dynamic  link  among  the  Geographic  Information  System  (GIS)  ArcView,  the  dynamic  statistical 
graphics  (DSG)  program  XGobi,  and  the  statistical  computing  environment  XploRe  has  been  successfully 
used  to  explore  and  analyze  all  kinds  of  spatially  referenced  data  —  from  forest  health  data  over  precipitation 
data  to  precision  agricultural  data.  In  this  paper,  we  will  focus  on  the  exploration  of  satellite  images  using 
the  linked  ArcView/XGobi/XploRe  software  environment. 

INTRODUCTION 

Multispectral satellite image data, i.e., remote sensing data gathered through earth observation satellites,
have attracted the attention of statisticians working in the field of dynamic statistical graphics (DSG) during the
last decade. Many prototypes of software systems have been developed over time that focus on many
interesting software features but still lack the full functionality that is required to geographically explore
satellite images.

The ArcView/XGobi/XploRe software environment consists of the Geographic Information System (GIS)
ArcView 3.0™, the dynamic statistical graphics (DSG) program XGobi [25], and the statistical computing
environment XploRe [13]. These three programs, linked together into one single environment, provide
features such as linked brushing among multiple windows of different types (e.g., map views and statistical
plots), grand tour [1], [2], projection pursuit guided tour [6], and standard clustering and classification
methods, that are very powerful when jointly used for the exploration and classification of satellite images.

In the following sections we will look at examples from the literature where satellite images have been
analyzed using dynamic statistical graphics. We describe the ArcView/XGobi/XploRe software environment,
and we explain how it can be used for remote sensing data. We finish with a discussion and an overview of
future directions.


SATELLITE  IMAGES  AND  DSG 

Various statistical methods have been developed for the classification of remote sensing imagery. Many of
these methods are based on principal components and are widely used in the remote sensing community for
unsupervised classification purposes. Other statistical approaches that are also well acknowledged in the
remote sensing community focus on optimum band selection for supervised classification.
However, most of these solutions ignore the visual capabilities of human beings, depend heavily on possibly
incorrect ground truth, or omit potentially valuable information at an early stage of the analysis.

Exploration of satellite images via DSG is mostly based on human interaction and visual capabilities.
This approach helps to detect incorrect ground truth, and it does not ignore any of the information that is
available. Unfortunately, only experimental software solutions that support this approach exist within the
statistical community so far. Some examples of the exploration of satellite images through DSG follow.

Approved for public release; distribution is unlimited.

ArcView 3.0 is a trademark of Environmental Systems Research Institute, Inc.

In [4], a Landsat 2 multispectral scanner (MSS) image of the confluence of the Rio Solimoes and the
Rio Negro near Manaus, Brazil, is considered. The authors use the Data Viewer [3], [14], a program
for graphical data analysis, to explore the four-dimensional spectral data. They are able to confirm
graphically (through linked brushing and the grand tour) that the measurements on a transect across
the confluence (mixing) of the two rivers can be represented as a convex combination of the corresponding
two water spectra. An earlier video [19] provides more details of this analysis.
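
The convex-combination statement can also be checked numerically. The sketch below is not the graphical method
used in [4]; it is a least-squares analogue under the assumption that the two pure water spectra are known. For a
pixel spectrum x and pure spectra s1 and s2, it finds the weight t that minimizes ||x - (t s1 + (1 - t) s2)||. A pixel
is a convex combination when t lies in [0, 1] and the residual is close to zero. All spectra below are made-up values.

```python
def mixing_fraction(x, s1, s2):
    """Least-squares weight t with x ~ t*s1 + (1-t)*s2, plus the residual norm."""
    d = [a - b for a, b in zip(s1, s2)]
    denom = sum(v * v for v in d)
    t = sum((xi - bi) * di for xi, bi, di in zip(x, s2, d)) / denom
    fitted = [t * a + (1.0 - t) * b for a, b in zip(s1, s2)]
    residual = sum((xi - fi) ** 2 for xi, fi in zip(x, fitted)) ** 0.5
    return t, residual


# Hypothetical four-band spectra for the two rivers (illustrative numbers only).
solimoes = [40.0, 35.0, 20.0, 8.0]
negro = [15.0, 10.0, 6.0, 3.0]
# A synthetic transect pixel: 25% of the first spectrum, 75% of the second.
pixel = [0.25 * a + 0.75 * b for a, b in zip(solimoes, negro)]

t, residual = mixing_fraction(pixel, solimoes, negro)
is_convex_mix = 0.0 <= t <= 1.0 and residual < 1e-6
```

For real data the residual will not vanish, so a tolerance would have to be chosen from the noise level of the sensor.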

In [15], the simple image program MTID and XGobi have been linked together. The authors analyze
Landsat Thematic Mapper (TM) images of the agricultural Parana State region. Based on their graphical
analysis, a discrepancy between the images and the ground truth becomes immediately obvious. The most
likely cause is a shift of the recorded ground truth relative to the images.

Scott  [23]  considers  four-channel  MSS  data  from  NASA’s  Landsat  IV  satellite,  taken  from  an  agricultural 
scene  in  North  Dakota  on  5  different  days  in  1977.  Scott  displays  trivariate  scatterplots  and  contours  of 
the  transformed  20-dimensional  data  and  he  finds  clusters  that  clearly  discriminate  between  sunflowers  and 
small  grains. 

Saleh  and  Scott  [21]  demonstrate  a  2-dimensional  density  grand  tour  of  a  Landsat  satellite  image  based 
on  three  groups  of  farm  crops.  The  density  grand  tour  reveals  that  at  least  one  of  the  crops  contributes  to 
the  multimodality  of  the  densities.  This  is  a  feature  that  is  also  detected  in  other  agricultural  settings  if  one 
(or  more)  crop(s)  has  been  planted  at  different  points  in  time,  thus  yielding  a  wide  variety  of  the  underlying 
spectral  measurements. 

Finally, Carr [5] presents a hexagon bin scatterplot matrix based on the seven spectral bands of a Landsat
image at a Nevada test site. This binned scatterplot matrix shows areas of high and low point
concentrations for pairs of the seven spectral bands, almost perfect linear relationships between several
pairs of bands, a cutoff of intensity values for the thermal band, and bivariate outliers.

Unfortunately, in the last two examples the statistical plots are not directly linked to the original satellite
image. Therefore, no immediate geographic conclusion can be drawn from the graphical statistical
exploration of the data.


THE  ARCVIEW/XGOBI/XPLORE  ENVIRONMENT 

The ArcView/XGobi/XploRe environment is a software environment in which the Geographic Information
System (GIS) ArcView 3.0, the dynamic statistical graphics (DSG) program XGobi [25], and the statistical
computing environment XploRe [13] have been linked together, thus providing us with three different main
concepts in one single environment. We can (1) display spatial locations and concomitant variables on maps,
(2) visualize these concomitant variables using interactively manipulated dynamic statistical graphics, and
(3) conduct numerical statistical analyses. ArcView's strengths are in (1), and it offers only limited support
for (2) and (3); XGobi's and XploRe's strengths are in (2) and (3), respectively. A link between these three
programs provides not only the sum of the features of the underlying programs; in addition, we
can gain further insight into and understanding of our data through interaction and dynamic linking among the
programs. See [27] and [26] for technical details of the software environment and possible applications for
spatial data analysis such as real estate data and precision agricultural data.
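
The idea of linked brushing among a map view and statistical plots can be sketched in a few lines of Python. This is
a conceptual illustration only, not the mechanism of the actual environment: the class `LinkedViews`, its methods,
and the view names are hypothetical. The key assumption is that every view identifies an observation by the same
case identifier, so brushing in one view can be forwarded to the others.

```python
class LinkedViews:
    """Forward a set of brushed case identifiers from one view to all other views."""

    def __init__(self):
        self.callbacks = {}   # view name -> function called with the brushed ids
        self.brushed = set()

    def register(self, name, callback):
        self.callbacks[name] = callback

    def brush(self, source, case_ids):
        """Record the brushed cases and notify every view except the source."""
        self.brushed = set(case_ids)
        for name, callback in self.callbacks.items():
            if name != source:
                callback(self.brushed)


# Each hypothetical view just records what it was asked to highlight.
highlighted = {"map": set(), "scatterplot": set(), "dotplot": set()}
link = LinkedViews()
for view_name in highlighted:
    # Bind view_name at definition time so each callback updates its own view.
    link.register(view_name, lambda ids, v=view_name: highlighted[v].update(ids))

# Brushing cases 3, 7, and 8 in the scatterplot highlights them in the other views.
link.brush("scatterplot", [3, 7, 8])
```

In this sketch the source view is assumed to have drawn its own brushing already, which is why it is skipped when the
notification is sent out.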

Remote  Procedure  Calls  (RPCs)  are  the  underlying  Interprocess  Communication  (IPC)  mechanism  used 
for  the  communication  among  the  three  programs.  More  details  on  this  IPC  technology  can  be  found  in  [24] 
and  [9]. 

In addition to the publications on the entire ArcView/XGobi/XploRe environment, there exist several
articles that focus on the preliminary ArcView/XGobi link. A description of the main features of this link
(multivariate data, spatial cumulative distribution functions, variogram-cloud plots, spatially lagged
scatterplots, and multivariate variogram-cloud plots) and examples that range from forest health data and
precipitation data to satellite images can be found in [7] and [8], for example. In [31], a comprehensive
summary of the technical details of the ArcView/XGobi link has been given. What has been stated
there still holds for the entire ArcView/XGobi/XploRe environment. Recent developments on XploRe can
be found in [17], [18], and [22].

EXPLORATION  OF  SATELLITE  IMAGES  THROUGH  ARCVIEW/XGOBI/XPLORE 

In this section, we look at three different examples of remote sensing imagery. The first two examples only
make use of the ArcView/XGobi link and have been described in more detail in [29] and [30] (Example 1)
and [7] (Example 2). Example 3 is based on data presented in [11] and [12] and makes use of the full
ArcView/XGobi/XploRe environment.
EXAMPLE 1: LAKE ICARIA WATERSHED


Figure 1: The ArcView map view (top) is linked with two XGobi views (bottom). The left XGobi shows the
7 clusters obtained from a hierarchical cluster analysis. The right XGobi shows a scatterplot of the variables
Icariac1 and Icariac2. The two clusters brushed in the left XGobi fall on lake locations (circles) or areas
with active vegetation (filled boxes) in the ArcView map view. They also form two of the corners of the data
triangle in the right XGobi.


This example shows three ranges of wavelengths of electromagnetic radiation (expressed through the
variables Icariac1, Icariac2, and Icariac3) measured on April 22, 1990, by the SPOT earth observation
satellite. The area under consideration is the Lake Icaria watershed in southwest Iowa (near Corning).
Instead of using all 300,000 pixels of the scene that are available in ArcView, we took a systematic random
sample (e.g., [10], Section 5.6.1) of 800 locations for further analysis in XGobi.
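
A systematic random sample of this size can be sketched as follows. This is an assumed, simplified variant (a random
start followed by a fixed step through the pixels in row-major order), not the script used for the analysis. The
scene dimensions of 500 by 600 pixels are hypothetical; only the totals of 300,000 pixels and 800 sample locations
come from the text.

```python
import random


def systematic_sample(n_pixels, n_sample, rng):
    """Pick n_sample pixel indices: a random start, then every step-th pixel."""
    step = n_pixels // n_sample          # 300000 // 800 = 375
    start = rng.randrange(step)          # random start in [0, step)
    return [start + i * step for i in range(n_sample)]


# Hypothetical scene layout; 500 * 600 = 300,000 pixels.
N_ROWS, N_COLS = 500, 600
indices = systematic_sample(N_ROWS * N_COLS, 800, random.Random(1990))
# Convert each flat index to a (row, column) location in the image.
locations = [divmod(i, N_COLS) for i in indices]
```

A two-dimensional variant would instead choose a random offset inside one grid cell and then step regularly in both
row and column direction, which spreads the sample more evenly over the scene.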

At the time we analyzed this data, XploRe was not yet accessible through a link. It took the user five
individual steps to create a new variable Clust that contains the result of a hierarchical cluster analysis.
To ease the access to analytical statistical methods, we decided to add XploRe to the ArcView/XGobi link.
Today, within the ArcView/XGobi/XploRe environment, these steps could be done by pressing a few buttons
in ArcView to activate the required functionality within the other two programs.

Figure 1 (a reprint from [29] and [30]) shows the ArcView view (top) and two XGobi views (bottom).
We brushed two of the clusters in the dotplot of Clust (left XGobi). We see that these clusters are two of the
corners of the data triangle in the scatterplot of the variables Icariac1 and Icariac2 (right XGobi). In the
ArcView view, these points fall on lake locations (circles) or areas with active vegetation (filled boxes). The
next step of our interactive analysis might have been the brushing of the third corner of the data triangle
in the right XGobi to see if these points also relate to a particular geographic region in the map view. An
accompanying video [28] gives additional insight into this interactive and dynamic analysis.


EXAMPLE  2:  BORDER  REGION  BETWEEN  VERMONT  AND  NEW  HAMPSHIRE 


Figure 2: The ArcView map view (left) shows the spatial region while the XGobi view (right) shows an
interesting projection that makes it possible to distinguish between two types of water. The cluster of points
(marked with circles) detected in XGobi relates to the middle of the river and the small lake in the upper left
corner of the ArcView map view. The remaining points (marked with x's) that are widely spread in the XGobi
view fall onto the smaller ponds and the southern border of the river. The XGobi variable panel indicates that
bands 2, 4, and 5 are major contributors to this projection.

While we only consider main classifications such as water and vegetation in Example 1, this example
demonstrates that water is not always equal to water and that a quarry can be separated from areas that look
similar in the visible satellite image. This time we have measurements from 6 spectral bands recorded
by the Thematic Mapper (TM) instrument of the Landsat earth observation satellite. The area under
consideration is the border region between the states of Vermont and New Hampshire. In Figures 2
and 3 we look at two very small segments of this area. The ArcView view in each of the two figures is based
on bands 4, 3, and 2 only. For the XGobi views, we took random samples of points of known origin.
The ArcView images have been processed with an image processing program after the analysis to yield a
better gray scale representation. On a computer screen you would see red symbols on a mostly green/blue
background.

Figure 3: The ArcView map view (left) shows the spatial region while the XGobi view (right) shows an
interesting projection that makes it possible to distinguish among quarry regions (marked with x's), a
clearcut area, and clouds. The XGobi variable panel indicates that bands 2 and 5 are major contributors to
this projection.

In Figure 2 samples of pixels from the water bodies have been taken. The big water mass that stretches
diagonally across the ArcView view (left) is the Connecticut River. Three smaller water bodies are
also visible. In the XGobi view (right) we see a concentration of circles that are clustered along a line and
a large number of x's that are scattered in the plot. In the map view we see that the circles fall onto the
middle of the river and the small lake in the upper left corner. These points may relate to factors such as a
strong current, deep water, and no overgrown vegetation. In contrast, the x's fall onto two smaller ponds and
the southern border of the river. These points may relate to shallow water with algae or water plants in the
ponds or overgrown vegetation at the edge of the river. Note that there are no x's at the northern border of
the river, an indicator that there is no vegetation on this border. Most importantly, the visible ArcView
view based on bands 4, 3, and 2 does not allow a distinction among different water bodies. However, the
XGobi projection allows such a distinction since, as the XGobi variable panel indicates, band 5, which is not
used in ArcView, is (in addition to bands 2 and 4) a major contributor to this projection.

In Figure 3 the ArcView view (left) shows a region where two clouds, a clearcut area, and two quarry
regions have been classified as ground truth. In the XGobi view (right) we can distinguish three regions
of points. We brushed the points from the quarry regions with an x in the ArcView view and notice that
all these points fall into one of the three regions. However, there are additional points in this region that
have not been marked. This might indicate an additional quarry region that has been classified differently
in the ground truth. Through brushing in XGobi we would also see where these points are located
on the map. Note that once again bands 5 and 2 are major contributors to the XGobi view but band 5 is
not a part of the ArcView view.
EXAMPLE 3: AGRICULTURAL SCENE FROM THE IMPERIAL VALLEY

Figure 4: The ArcView map view shows the reported crop information for an agricultural scene and the
sampling locations which have been used for further analysis in XGobi and XploRe. Six crops are identified
from the ground. For fields with no crops or mixtures of crops no information has been recorded. There are
7 bands of remotely sensed information available for clustering and comparison with the ground truth.


In this example we are looking at data from 7 bands recorded by a Landsat 4 TM instrument on
December 12, 1982. The scene under consideration represents a very small area of the Imperial Valley in California.
In fact, we only consider 124 fields with known crop information provided by the Imperial Valley Irrigation
Board. See [11] and [12] for additional details on the data and assessments of different classification
techniques. Figure 4 shows the known crop types for the area under investigation. It is a reproduction within
ArcView that is based on Figure 2 in [12]. We took a systematic random sample of 314 locations for further
analysis in XGobi and XploRe.

As usual, we make use of XGobi's grand tour feature to detect clusters. Figure 5 shows such a cluster
in the XGobi view (right). The points that have been brushed (with an open box) all fall into fields of
beets in the ArcView map view (left). The ArcView map suggests that some points in the beets fields have
been missed through the XGobi classification. This is not the case. As Figures 3, 4, and 5 in [12] indicate,
there seem to be streets bordering each field. The points that appear to be located within fields of beets
according to the imprecise ground truth used in ArcView most likely fall onto streets, thus yielding quite
different wavelength measurements (and therefore projections in the XGobi view) than points from the fields
of beets.

Before the next step we deleted the points from XGobi and ArcView that had been classified as beets in the
previous step; thus, we are conducting a hierarchical analysis in this example. Figure 6 shows another cluster
(brushed with an open box) in the XGobi view (right) that relates to fields of alfalfa in the ArcView map
view (left). However, a large number of alfalfa points in the map are not highlighted. Again, [12] provides an
explanation: there are “new” and “old” alfalfa fields. Unfortunately, this additional information was not
available for our analysis. However, it seems very likely that our given projection in XGobi allows a
distinction between “new” and “old” alfalfa fields.

Figure 5: The ArcView map view (left) shows the spatial region while the XGobi view (right) shows an
interesting projection that makes it possible to distinguish between beets (open boxes) and other crops
(filled circles). Points that appear to fall on beets fields but are not contained in the XGobi cluster have
been circled in the ArcView map view. From additional sources we know that these points most likely fall
onto streets dividing or surrounding the fields.

One  should  note  that  in  Figure  6  we  also  brushed  the  single  sampling  point  that  falls  into  the  broccoli 
field.  In  the  XGobi  view  the  corresponding  filled  box  appears  at  one  edge  of  the  alfalfa  cluster.  It  would  be 
interesting  to  have  a  larger  sample  of  points  from  the  broccoli  field  and  see  if  a  distinction  between  alfalfa 
and  broccoli  is  possible. 

In our interactive environment we would try to find additional clusters in XGobi that allow us to distinguish
among the other type of alfalfa, cotton fields, and fallow fields. Since we have only two sample points for
onions, it is very likely that these will not be identified as an additional crop. Taking more sample points
from this crop's field would probably help in clustering it.

If we look at the XGobi variable panels in Figures 5 and 6, we see that each of the 7 bands has a
non-negligible influence on at least one of the visible projections. Any classification approach that tries to
eliminate any of the bands will most likely produce a less precise classification.

In addition to the visual approach in XGobi presented so far, we also make use of XploRe's clustering
capabilities to analyze this data set. XploRe supports two basic clustering methods, i.e., hierarchical
clustering and partitioning clustering. Partitioning methods (kmeans in XploRe) require an initial partition.
Other k-means clustering methods, like adaptive k-means clustering, are also available in XploRe.

In this example we make use of a hierarchical method. In this approach, interpoint distances between cluster
centers are used to form new clusters. We start with n clusters that contain one observation each
and successively merge clusters. XploRe offers different hierarchical methods such as single linkage,
complete linkage, average linkage, centroid linkage, the Ward method, and the Lance and Williams method. For
a short overview of these methods see [16], for a more detailed overview see [20], and for an overview of
methods implemented in XploRe see [13].

Figure 6: The ArcView map view (left) shows the spatial region while the XGobi view (right) shows an
interesting projection that makes it possible to distinguish between alfalfa (open boxes) and other crops
(filled circles). Alfalfa fields that are not contained in the XGobi cluster have been circled in the ArcView
map view. There are “new” and “old” alfalfa fields. The cluster in XGobi most likely relates to only one of
these types. Broccoli (just one sample point represented by the filled box) has values similar to alfalfa.

To use the hierarchical method we have to select a distance measure and one of the previously listed
methods. We choose the Euclidean distance and the Ward method. The Ward method minimizes the
“within” cluster variances. The dendrograms in Figure 7 show the logarithm of the sum of the “within”
cluster variances on the y-axis. We see at which levels observations and clusters are merged together (bottom:
all 313 observations*; top: one cluster). Since the variances of the clusters vary, we applied clustering to
the original data and to the standardized data. Both dendrograms show a large increase of the sum of the
“within” cluster variances if we merge the last two clusters. The dendrogram of the standardized variables
also shows an increase of the sum of the “within” cluster variances if we have four clusters.
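
The merging rule of the Ward method can be illustrated with a small, naive Python sketch. It is not the XploRe
implementation; it is a direct textbook version that, at each step, merges the two clusters whose union causes the
smallest increase in the total within-cluster sum of squares, which for clusters A and B with centroids c_A and c_B
equals |A||B| / (|A| + |B|) times the squared Euclidean distance between c_A and c_B. The function name
`ward_cluster` and the six two-dimensional points are made up for illustration.

```python
def ward_cluster(points, n_clusters):
    """Naive agglomerative clustering with Ward's criterion.

    Returns the final clusters (lists of point indices) and the list of
    increases in within-cluster sum of squares, one per merge.
    """
    clusters = [{"members": [i], "centroid": list(p)} for i, p in enumerate(points)]
    merge_costs = []
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                a, b = clusters[i], clusters[j]
                na, nb = len(a["members"]), len(b["members"])
                d2 = sum((x - y) ** 2 for x, y in zip(a["centroid"], b["centroid"]))
                cost = na * nb / (na + nb) * d2   # increase in within sum of squares
                if best is None or cost < best[0]:
                    best = (cost, i, j)
        cost, i, j = best
        a, b = clusters[i], clusters[j]
        na, nb = len(a["members"]), len(b["members"])
        merged = {
            "members": a["members"] + b["members"],
            "centroid": [(na * x + nb * y) / (na + nb)
                         for x, y in zip(a["centroid"], b["centroid"])],
        }
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
        merge_costs.append(cost)
    return sorted(sorted(c["members"]) for c in clusters), merge_costs


# Made-up data: two well separated groups of three points each.
data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
two_groups, _ = ward_cluster(data, 2)
_, all_costs = ward_cluster(data, 1)   # merge all the way down to one cluster
```

In `all_costs` the last entry, the cost of merging the final two clusters, is far larger than the earlier ones. A jump of
this kind is what one looks for in a dendrogram when choosing the number of clusters.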

Since we know the clustering based on ground truth, we can compare the results of the clustering algorithm
(based on the original data) and the true groups as in Table 1. The Ward method performs slightly worse
on the standardized data (not shown). However, Table 1 shows that with two clusters (left part) just group
1 (beets) is somewhat separated from the other crops. An increase to four clusters (right part) shows that
group 3 (alfalfa) is split into three of the four clusters. This is not surprising since we already know that
there are different types of alfalfa. Group 1 (beets) is separated very well from the other crops. The first
cluster contains not only parts of group 3 (alfalfa) but also groups 2 (cotton) and 4 (fallow). Obviously,
many more than four clusters are required to distinguish among these crops. This matches the results of [11]
where a clustering method based on unsupervised signatures resulted in 19 clusters. Since groups 5 (onions)
and 6 (broccoli) contain only 2 and 1 observations, respectively, it is basically impossible to detect these as
separate clusters.
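
A comparison of cluster assignments with ground truth, as in Table 1, amounts to a cross-tabulation. The following
Python sketch shows the mechanics on a handful of made-up labels; the function name `cross_tab` and the six
observations are hypothetical and do not reproduce the counts of Table 1.

```python
from collections import Counter


def cross_tab(cluster_labels, truth_labels):
    """Count how many observations fall into each (cluster, true group) cell."""
    counts = Counter(zip(cluster_labels, truth_labels))
    rows = sorted(set(cluster_labels))
    cols = sorted(set(truth_labels))
    table = [[counts[(r, c)] for c in cols] for r in rows]
    return rows, cols, table


# Made-up example: six observations, two clusters, three crops.
clusters = [1, 1, 2, 2, 2, 1]
truth = ["beets", "beets", "alfalfa", "cotton", "alfalfa", "alfalfa"]
rows, cols, table = cross_tab(clusters, truth)

# Print one line per cluster, one column per crop.
print("cluster", *cols)
for r, counts_row in zip(rows, table):
    print(r, *counts_row)
```

A crop that is well separated shows up as a column with nearly all of its count in a single row, while a crop that is
split across clusters has its count spread over several rows.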

Figure  8  shows  some  good  bivariate  projections  based  on  the  partition  found  in  the  data.  The  two  cluster 
projection  (left)  shows  a  strong  linearity  within  the  data  whereas  the  four  cluster  projection  (right)  shows 
how  well  the  clusters  are  separated  in  the  data. 


* One observation was deleted from the dataset since its measurements in all channels were zero.

Figure 7: The dendrograms for the original data (left) and standardized data (right) indicate how clusters
are merged together. At the bottom we have all 313 observations and on top we have one cluster. On the
y-axis we see the logarithm of the sum of the “within” cluster variances.


Table 1: Comparison of clustering (based on the original data) and the ground truth based on two clusters
(left part) and four clusters (right part). The numbers 1 to 6 on top of the table stand for the 6 crops, i.e.,
beets, cotton, alfalfa, fallow, onions, and broccoli.

Figure 8: Bivariate projections based on the partition found in the data. The two cluster projection (left)
shows a strong linearity within the data whereas the four cluster projection (right) shows how well the clusters
are separated in the data.

SUMMARY AND FUTURE DIRECTIONS

In this paper we described the use of the ArcView/XGobi/XploRe software environment for the
exploration and analysis of satellite images. Within this environment we are able to detect interesting features
of the remote sensing imagery through their appearance in a series of dynamic plots in XGobi. We can also
conduct an analytical statistical analysis of the data in XploRe.

Further work can proceed in two directions. First, one should consider how existing classification
techniques for remote sensing imagery can be combined with the visual approach highlighted in this paper.
What is the gain if we combine these methods, that is, what additional percentage of pixels do we classify
correctly?

The work presented in this paper should not be misunderstood. We do not claim at all that the
classification through dynamic statistical graphics presented here is better than any existing classification
technique. But we are convinced that existing classification techniques combined with the graphical approach
help to produce better classification results in an easy way. This should be further investigated.

The second direction concerns possible improvements of the software environment. For example, it is
absolutely necessary to remove one restriction of this environment, i.e., the fact that only one of the graphical
links is fully functional at a time. For instance, if we brush points in the XGobi view, they are also brushed
either in ArcView or in XploRe, but not in both other programs at the same time, as one would expect
from such a linked environment. A solution has been designed (see the part on “hierarchical” linked brushing
in [31] and [26]) but it has not been implemented yet.

There exist other possible extensions that would be particularly useful for the exploration of satellite
images. As we have seen in all of our examples, we typically use only a very small subset of the pixels of a
satellite image. So far, we have used individually written AVENUE scripts in ArcView to do this sampling, e.g.,
systematic random sampling or stratified sampling based on ground truth such as water, road, or urban
surfaces. An easily accessible collection of sampling mechanisms in this environment would be ideal.
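
Stratified sampling based on ground truth can be sketched as follows. The sketch is written in Python rather than
AVENUE and is not one of the scripts mentioned above; the function name `stratified_sample`, the stratum sizes,
and the choice of an equal number of pixels per stratum are assumptions made for illustration.

```python
import random


def stratified_sample(labels, per_stratum, rng):
    """Draw up to per_stratum pixel indices at random from each ground-truth class."""
    by_stratum = {}
    for index, label in enumerate(labels):
        by_stratum.setdefault(label, []).append(index)
    sample = {}
    for label, pixel_indices in by_stratum.items():
        k = min(per_stratum, len(pixel_indices))   # small strata are taken whole
        sample[label] = sorted(rng.sample(pixel_indices, k))
    return sample


# Made-up ground truth for 19 pixels: 10 water, 3 road, 6 urban.
ground_truth = ["water"] * 10 + ["road"] * 3 + ["urban"] * 6
sample = stratified_sample(ground_truth, 4, random.Random(42))
```

Sampling the same number of pixels from every class keeps rare surface types visible in the plots; sampling
proportionally to class size would instead preserve the class frequencies of the image.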

Also, the concepts of “geographic brushing” through ArcView (e.g., brush all points that are at most
1 km away from a known water source) and “statistical brushing” through XploRe (e.g., brush all points
where the greenness index, a function based on measurements of some of the satellite bands, falls into
a given range to show actively growing vegetation) would be additional useful extensions of this software
environment for the exploration of satellite images and other geographically referenced data.
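
Both brushing concepts reduce to selecting the set of points that satisfy a condition. The Python sketch below is a
hypothetical illustration, not an existing feature of the environment. It assumes map coordinates in meters and uses
a normalized difference of a near-infrared and a red band as a stand-in for the greenness index; the index actually
intended is not specified in the text. All coordinates and band values are made up.

```python
import math


def geographic_brush(points, source, max_dist_m=1000.0):
    """Indices of points at most max_dist_m meters from a known source location."""
    sx, sy = source
    return {i for i, (x, y) in enumerate(points)
            if math.hypot(x - sx, y - sy) <= max_dist_m}


def greenness(red, nir):
    """Assumed greenness index: normalized difference of near-infrared and red."""
    return (nir - red) / (nir + red)


def statistical_brush(bands, low, high):
    """Indices of points whose greenness index falls into [low, high]."""
    selected = set()
    for i, (red, nir) in enumerate(bands):
        if red + nir == 0:          # skip pixels without any signal
            continue
        if low <= greenness(red, nir) <= high:
            selected.add(i)
    return selected


# Made-up sample: map coordinates in meters and (red, near-infrared) values.
coords = [(0.0, 0.0), (300.0, 400.0), (1500.0, 0.0)]
band_values = [(50.0, 150.0), (80.0, 90.0), (30.0, 20.0)]

near_water = geographic_brush(coords, source=(0.0, 0.0))
growing = statistical_brush(band_values, 0.3, 1.0)
# Points that are both near the water source and actively growing.
both = near_water & growing
```

Because each brush returns a set of case indices, combined queries are plain set operations, and the resulting
set could be passed on to the linked views for highlighting.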

ACKNOWLEDGMENTS 

Symanzik’s work was partially supported by a National Science Foundation Group Infrastructure Grant
DMS-9631351. Additionally, it was supported by the Deutsche Forschungsgemeinschaft,
Sonderforschungsbereich 373 “Quantifikation und Simulation ökonomischer Prozesse”, Humboldt-Universität zu
Berlin. The article has not been subjected to the review of any of the previously mentioned agencies and thus
does not necessarily reflect the view of any of these agencies, and no official endorsement should be inferred.
We would like to thank Barry Haack and Matthew Bechdol for providing us with the satellite image used for
Example 3.

REFERENCES 

[1] D. Asimov. The Grand Tour: A Tool for Viewing Multidimensional Data. SIAM Journal on Scientific and
Statistical Computing, 6(1):128-143, 1985.

[2]  A.  Buja  and  D.  Asimov.  Grand  Tour  Methods:  An  Outline.  In  D.  M.  Allen,  editor,  Proceedings  of  the  17th 
Symposium  on  the  Interface  between  Computer  Science  and  Statistics,  Lexington,  KY,  pages  63-67.  Elsevier, 
1986. 

[3]  A.  Buja,  C.  Hurley,  and  J.  A.  McDonald.  A  Data  Viewer  for  Multivariate  Data.  In  T.  J.  Boardman  and  I.  M. 
Stefanski,  editors,  Proceedings  of  the  18th  Symposium  on  the  Interface  between  Computer  Science  and  Statistics, 
Fort  Collins,  CO,  pages  171-174.  American  Statistical  Association,  Washington,  D.C.,  1986. 

[4] A. Buja, J. A. McDonald, J. Michalak, and W. Stuetzle. Interactive Data Visualization Using Focusing and
Linking. In G. M. Nielson and L. Rosenblum, editors, Proceedings of Visualization '91, Los Alamitos, CA, pages
156-163. IEEE Computer Society Press, 1991.

[5]  D.  B.  Carr.  Looking  at  Large  Data  Sets  Using  Binned  Data  Plots.  In  A.  Buja  and  P.  T.  Tukey,  editors,  Computing 
and  Graphics  in  Statistics,  pages  7-39.  Springer,  New  York,  NY,  1991. 




[6] D. Cook, A. Buja, J. Cabrera, and C. Hurley. Grand Tour and Projection Pursuit. Journal of Computational
and Graphical Statistics, 4(3):155-172, 1995.

[7] D. Cook, J. J. Majure, J. Symanzik, and N. Cressie. Dynamic Graphics in a GIS: Exploring and Analyzing
Multivariate Spatial Data Using Linked Software. Computational Statistics, 11(4):467-480, 1996.

[8]  D.  Cook,  J.  Symanzik,  J.  J.  Majure,  and  N.  Cressie.  Dynamic  Graphics  in  a  GIS:  More  Examples  Using  Linked 
Software.  Computers  and  Geosciences:  Special  Issue  on  Exploratory  Cartographic  Visualization,  23(4):371-385, 
1997.  Paper,  CD,  and  http://www.elsevier.nl/locate/cgvis. 

[9]  J.  R.  Corbin.  The  Art  of  Distributed  Applications:  Programming  Techniques  for  Remote  Procedure  Calls. 
Springer,  New  York,  Berlin,  Heidelberg,  1991. 

[10]  N.  A.  C.  Cressie.  Statistics  for  Spatial  Data  (Revised  Edition).  Wiley,  New  York,  NY,  1993. 

[11]  B.  Haack  and  S.  Jampoler.  Agricultural  Classification  Comparisons  Using  Landsat  Thematic  Mapper  Data.  ITC 
Journal,  1994(2):113-118,  1994. 

[12] B. Haack and S. Jampoler. Colour Composite Comparisons for Agricultural Assessments. International Journal
on Remote Sensing, 16(9):1589-1598, 1995.

[13] W. Härdle, S. Klinke, and B. A. Turlach. XploRe: An Interactive Statistical Computing Environment. Springer,
New York, Berlin, Heidelberg, 1995.

[14]  C.  Hurley.  A  Demonstration  of  the  Data  Viewer.  In  E.  J.  Wegman,  D.  T.  Gantz,  and  J.  J.  Miller,  editors, 
Proceedings  of  the  20th  Symposium  on  the  Interface  between  Computing  Science  and  Statistics,  Fairfax,  VA, 
pages  108-114,  American  Statistical  Association,  Alexandria,  VA,  1988. 

[15] R. Klein and R. I. Moreira. Exploratory Analysis of Agricultural Images via Dynamic Graphics. Technical
Report 9/94, Laboratório Nacional de Computação Científica, Rio de Janeiro, Brazil, 1994.

[16] S. Klinke. Data Structures for Computational Statistics. Physica-Verlag, Heidelberg, 1997.

[17] S. Klinke and T. Kötter. XploRe 4 - A Statistical Computing Environment. In F. Faulbaum and W. Bandilla,
editors, SoftStat '95 Advances in Statistical Software 5, pages 113-122, Stuttgart, 1996. Lucius & Lucius.

[18] T. Kötter. Development of XploRe 4: The Programming Language. In A. Prat and E. Ripoll, editors, Compstat
-  Proceedings  in  Computational  Statistics,  Short  Communications  and  Posters,  pages  201-202,  1996. 

[19] J. A. McDonald and S. Willis. Use of the Grand Tour in Remote Sensing. ASA Statistical Graphics Video
Lending  Library  (contact:  dfs@research.att.com),  1987. 

[20]  H.  J.  Mucha.  Clusteranalysis  auf  Mikrocomputern.  Akademie  Verlag,  Berlin,  1992. 

[21]  J.  D.  Saleh  and  D.  W.  Scott.  Data  Exploration  with  the  Density  Grand  Tour.  Statistical  Computing  and 
Statistical  Graphics  Newsletter,  8(1):7-11,  1997. 

[22] S. Schmelzer, T. Kötter, S. Klinke, and W. Härdle. A New Generation of a Statistical Computing Environment
on  the  Net.  In  A.  Prat,  editor,  Compstat  -  Proceedings  in  Computational  Statistics,  pages  135-148,  Heidelberg, 
1996.  Physica-Verlag. 

[23] D. W. Scott. Data Analysis in Three and Four Dimensions with Nonparametric Density Estimation. In E. J.
Wegman  and  D.  J.  DePriest,  editors,  Statistical  Image  Processing  and  Graphics,  pages  291-305.  Marcel  Dekker, 
New  York,  NY,  1986. 

[24] W. R. Stevens. UNIX Network Programming. Prentice-Hall, Englewood Cliffs, NJ, 1990.

[25]  D.  F.  Swayne,  D.  Cook,  and  A.  Buja.  XGobi:  Interactive  Dynamic  Graphics  in  the  X  Window  System.  Journal 
of  Computational  and  Graphical  Statistics,  1998.  Forthcoming. 

[26]  J.  Symanzik,  S.  Klinke,  S.  Schmelzer,  D.  Cook,  and  N.  Lewin.  The  ArcView/XGobi/XploRe  Environment: 
Technical  Details  and  Applications  for  Spatial  Data  Analysis.  ASA  Proceedings  of  the  Section  on  Statistical 
Graphics,  1997.  Forthcoming. 

[27] J. Symanzik, T. Kötter, S. Schmelzer, S. Klinke, D. Cook, and D. Swayne. Spatial Data Analysis in the
Dynamically Linked ArcView/XGobi/XploRe Environment. Computing Science and Statistics, 29, 1997. Forthcoming.

[28]  J.  Symanzik,  J.  J.  Majure,  and  D.  Cook.  Dynamic  Graphics  in  a  GIS:  Analyzing  and  Exploring  Multivariate 
Spatial  Data.  ASA  Statistical  Graphics  Video  Lending  Library  (contact:  dfs@research.att.com),  1995. 

[29] J. Symanzik, J. J. Majure, and D. Cook. Dynamic Graphics in a GIS: A Bidirectional Link between ArcView
2.0  and  XGobi.  Computing  Science  and  Statistics,  27:299-303,  1996. 

[30] J. Symanzik, J. J. Majure, and D. Cook. Dynamic Graphics in a GIS: A Bidirectional Link between ArcView
2.1  and  XGobi  —  An  Update.  Computing  Science  and  Statistics,  29(2):35-40,  1997. 

[31] J. Symanzik, J. J. Majure, D. Cook, and I. Megretskaia. Linking ArcView 3.0 and XGobi: Insight Behind the
Front  End.  Technical  Report  97-10,  Department  of  Statistics,  Iowa  State  University,  Ames,  Iowa,  1997. 


VISUAL  EXPLORATION  OF  SPATIAL  DATA  WITH  MANET 


Adalbert  F.X.  Wilhelm* 
Center  for  Computational  Statistics 
George  Mason  University 
Fairfax,  Virginia  22030 


ABSTRACT 

Manet is research software developed at the Department for Computer-Oriented Statistics and Data
Analysis at the University of Augsburg. Manet offers visual exploration of data sets. It is based on
the paradigm of linking low-dimensional views in a highly interactive environment. Two main features
of Manet are the consistent treatment of missing data in visualization and the link between graphic
representations of geographic space and graphic representations of attribute space. This paper shows
various examples of how this link can be used to explore data with underlying geographic information.

INTRODUCTION 

Spatial data are widely used to demonstrate new developments in interactive statistical graphics. This is
not a mere coincidence but a consequence of the extent to which spatial data and interactive graphics
can supplement each other. Spatial data are often characterized as being “data rich but theory poor”¹,
and they often do not meet the standard assumptions necessary to perform a confirmatory analysis.
Interactive graphics are based on the philosophy of exploratory data analysis (EDA) established by
Tukey and his students. In EDA the data should speak for themselves or, in Tukey’s words, “EDA is
looking at data to see what it seems to say”². The main goal in EDA is to detect patterns, to produce
hypotheses, and to identify potentially misleading or influential observations, such as outliers or leverage
points. All these aims suit the analysis of spatial data well. In addition, geographers are used to
working with graphical representations of their data, such as maps and cartograms.

It is therefore not surprising that a series of efforts has tried to connect interactive statistical
graphics and spatial data analysis. Three different branches of software development accompany
these efforts: integrating spatial ideas into interactive statistical software, integrating interactive graphics
into spatial analysis software, and linking Geographic Information Systems to dynamic graphics software.
Examples of the last approach are the XGobi/ArcView link³ and the SpaceStat/ArcView link⁴. The second
approach is pursued in the development of CDV⁵, the first in Regard⁶ (formerly Spider)⁷ and in
Manet⁸.

Typically, integrated systems are considerably faster in execution than coupling approaches, and they
provide a more seamless environment, which is crucial for highly interactive exploration.

Manet has three components that are mainly intended for spatial data (the polygon map, the trace
function, and weighted plots), and it is strongly influenced by the experience gained with Regard.

Interactive statistical graphics resulted from the merger of static data representations used for
exploratory data analysis with inexpensive graphics-capable desktop computers. But interactive graphics is
much more than just creating a graphic on the computer screen. The possibility of user interaction with
instantaneous response opens a completely new way of looking at data. Whereas the goal in cartography
and static graphics is to show as much information as possible in one plot, interactive graphics keep
the displays simple by hiding information that is not essential for interpretation until it is specifically
requested.

As yet, there is no precise definition of interactive statistical graphics. Researchers often
use the terms dynamic graphics, direct manipulation graphics, and interactive graphics interchangeably.
Eick and Wills⁹ define “Interactive Graphical Methods as the class of techniques for exploring data that
allow the user to manipulate a graphical representation of the data”. Moreover, they define an “Interactive
Graphic View as a pictorial representation of some form of data or information which the analyst can
manipulate in real time”⁹. Current statistical software packages offer at least scatterplot brushing and
3-D rotation as the main interactive procedures, but a much bigger variety of features has been proposed
in the literature and made available as prototypes. Interactive graphics are indispensable for acquiring
qualitative insights into data sets, for studying model residuals, and for revealing quantitative
results.

* On leave from Universität Augsburg, Germany.
Approved for public release; distribution is unlimited.

The principal concepts for making a single graphic interactive are:

Changing Projection Views: Paper and screen are restricted to two dimensions, and the human eye
system is trained for the three-dimensional world only. Dimension reduction techniques are used to
guarantee quick and easy perception. To regain some of the multi-dimensionality of the data set, a
rapid, dynamic, and smooth change of the projection views must be provided.

Rescaling: Perception of graphical displays strongly depends on the scale. Since there are no unique
choices, statistical software should provide the user with tools to change plot scales flexibly.

Interrogation: Graphics should not be overloaded. Additional information must be available on demand,
directly from the graphics.

Selection: Selecting subgroups and focusing on specific data points helps to reveal structure in the data
set. A wide variety of tools for selecting groups of points from graphical representations is needed to
perform a sophisticated analysis.

Linking: Full interactivity is only achieved when selection is not restricted to a single display but is
propagated to other plots. This means that all displays are connected and that each view of the data
shows each case consistently. Linking is the key concept of interactive statistical graphics; it builds
up a relation between measurements of various variables, between different graphical representations,
and between raw data and models. These links can also perform different functions:
the standard one is highlighting; color encoding and hiding are others.

For all these interactions an instantaneous response is crucial. As Eick and Wills point out, this means
that a “response time of 50ms or less is required”⁹.

Two different conceptual approaches exist for interactive statistical graphics: single window linking
and multiple window linking. The first concept requires that all linked plots be arranged in one single
window. Therefore, linking is often unnecessarily restricted to array arrangements of the same plot
type, e.g. scatterplot matrices and trellis displays¹⁰. In the multiple window approach each display lives
independently, but they all inherit selection information from the case labels.
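
The multiple window idea, in which independent displays share selection information through case
labels, can be illustrated with a minimal sketch. It is not Manet's implementation; the class names
`SelectionModel` and `View` and the case labels are hypothetical and chosen only for illustration.

```python
from typing import Callable, List, Set


class SelectionModel:
    """Shared selection state, keyed by case label (hypothetical helper)."""

    def __init__(self) -> None:
        self.selected: Set[str] = set()
        self._listeners: List[Callable[[Set[str]], None]] = []

    def subscribe(self, listener: Callable[[Set[str]], None]) -> None:
        self._listeners.append(listener)

    def select(self, cases: Set[str]) -> None:
        # Replace the selection and notify every linked display.
        self.selected = set(cases)
        for listener in self._listeners:
            listener(self.selected)


class View:
    """A display that shows some cases and highlights the selected ones."""

    def __init__(self, name: str, cases: Set[str], model: SelectionModel) -> None:
        self.name = name
        self.cases = cases
        self.highlighted: Set[str] = set()
        self.model = model
        model.subscribe(self.on_selection)

    def on_selection(self, selected: Set[str]) -> None:
        # Each view highlights only the selected cases it actually displays.
        self.highlighted = self.cases & selected

    def brush(self, cases: Set[str]) -> None:
        # A selection made in this view is propagated to all linked views.
        self.model.select(self.cases & cases)


model = SelectionModel()
map_view = View("map", {"A", "B", "C"}, model)
hist_view = View("histogram", {"A", "B", "C"}, model)
map_view.brush({"A", "C"})  # hist_view now highlights the same cases
```

Because every view only talks to the shared model, a new plot type can be linked without changing
the existing ones.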

For spatial data the paradigm of multiple window linking is very appealing. It seems quite natural
to combine the geographic location, in the form of a map, with the graphical representations of the variable
measurements. But in fact, maps have only been added quite recently as an additional view of the data;
see Monmonier¹¹, Haslett et al.⁷, and MacDougall¹².

Linking low-dimensional views is particularly helpful for the standard questions of exploratory data
analysis, like searching for outliers and clusters, or investigating distributional properties like symmetry
or modes. In contrast to high-dimensional rotating point clouds or parallel coordinate plots, linking
low-dimensional views allows the use of displays like histograms and boxplots, which convey the marginal
distributional properties much better. Interpretation of linked views is typically much easier than for
complicated high-dimensional plots since the user controls the process of investigation. Still, much of
the multivariate structure can be found.


INTERACTIVE  MAPS 

The key step in extending interactive graphics software to the analysis of spatial data is to add a
map tool. Data with point locations can simply be handled by drawing a scatterplot of longitude and
latitude, but many spatial data sets are based on regions. Manet is able to deal with such data by
drawing a polygon plot. How can we make a polygon map interactive? Changing projection views
could be interpreted as using different projection techniques to map the three-dimensional locations into
planar coordinates. A less sophisticated approach is to allow different representations of the same map.
This is done in Manet, where the user can switch between a filled polygon map, a hollow polygon map,
and a choropleth map (see Figure 1). Switching from one representation to another is easy and can be
performed by simple mouse clicks. The filled representation is more effective in showing the global spatial
distribution, but it is highly affected by the size of the regions. Perception of spatial distributions shown
in choropleth maps depends heavily on the scale used. In Manet two sliders are available to transform
the data towards a more appropriate scaling. A preview of the color distribution makes it easy to find a
transformation that results in a more informative choropleth map (see Figure 2). The problem with the
default map is mainly caused by one outlier with a multi-million population, Chicago, which shows up as
a white spot. The transformed map still shows Chicago as an outlier, but it shows the variability in the
bulk of the data much better. In addition, a second outlier appears at the lower end of the population
scale: Keweenaw, the county that is farthest north.

Figure 1: The default polygon map is a filled one (left). Changing plot parameters draws only the
boundaries (center). Dragging and dropping a variable over the plot changes it to a choropleth map
(right).

Transformations can also be used to reverse the color table. The default assigns white to the highest
values and black to the lowest, but depending on the context, interpretation might be easier, or the display
might be more consistent, with a reversed color table.

The choropleth map is not restricted to continuous variables. Discrete variables are handled in the
same way, resulting in what is often called a chorochromatic or k-color map.

Manet treats the map like any other graphical data display. Therefore, all facilities for user interaction
provided for statistical graphs, such as interrogation, are also available for the map.

LINKING  MAPS  AND  STATISTICAL  GRAPHICS 

Scatterplot brushing is today a fairly well-known exploratory visual technique. But the idea of brushing
is not restricted to scatterplot matrices. There are quite a few software packages available that
offer brushing for all kinds of statistical graphics, like bar charts, histograms, or box plots. From this
point of view it is straightforward to link the geographic information displayed in a map to the attribute
information displayed in a graphic. In Manet the polygon map is fully linked to all attribute displays.
For any subset of areas selected on the polygon map, the distributions of the corresponding observations
are highlighted in all non-spatial views. Similarly, for any subset of data selected in a statistical graph,
such as a category in a bar chart, the corresponding areas are highlighted on the map.

The basic application of this feature is to focus on interesting subsets and to compare their
characteristics with either the entire data set or the complementary subset. Methodologically, there is no
difference whether the selection is made on the map or on a statistical graph. In Figure 3, counties in the
Midwest region in which a high proportion of elderly people is known to be poor are selected (see boxplot
on the right). Immediately, the same counties are highlighted in all other plots. The boxplots for the age
categories 0-17 and 18-59 show that poverty is, as expected, a family-related problem that affects all
generations. More surprising, however, is the fact that poverty has a specific spatial distribution. All
the selected counties lie in the south of our region of investigation.

Figure 2: On the top is the default choropleth map and the corresponding frequency distribution.
Changing the sliders in the preview leads to an improved map shown on the bottom left.


Subset selections can also be based on both geographic and attribute information at the same time.
In Manet a series of selections can be composed with logical operations, and all steps of such sequences are
stored and can be changed individually. Thus, quite elaborate data queries can be performed graphically
and various subgroups can be checked effectively¹³.
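
A stored selection sequence of this kind can be sketched as a list of steps that is replayed whenever
one step changes. The step format and the particular set of operations ("replace", "and", "or", "xor",
"not") are assumptions made for this illustration; the text does not specify which operations Manet
provides.

```python
from typing import List, Set, Tuple

# One step = (logical operation, set of case labels picked in some display).
Step = Tuple[str, Set[str]]


def evaluate(steps: List[Step], all_cases: Set[str]) -> Set[str]:
    """Replay a stored selection sequence from left to right."""
    current: Set[str] = set()
    for op, cases in steps:
        if op == "replace":
            current = set(cases)
        elif op == "and":
            current = current & cases
        elif op == "or":
            current = current | cases
        elif op == "xor":
            current = current ^ cases
        elif op == "not":
            # Complement within the full data set; `cases` is ignored.
            current = all_cases - current
        else:
            raise ValueError(f"unknown operation: {op}")
    return current


all_cases = {"c1", "c2", "c3", "c4", "c5"}
steps: List[Step] = [
    ("replace", {"c1", "c2", "c3"}),  # e.g. areas picked on the map
    ("and", {"c2", "c3", "c4"}),      # e.g. a bar picked in a bar chart
]
first = evaluate(steps, all_cases)

# Changing one stored step and replaying gives the revised query.
steps[1] = ("or", {"c2", "c3", "c4"})
second = evaluate(steps, all_cases)
```

Keeping the steps rather than only the final result is what allows an individual step to be edited later.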

Tracing is a special kind of generalized brushing proposed by Craig, Haslett, Unwin and Wills¹⁴. It
computes statistics, like the mean, the span, or the standard deviation, of specified variables for all points
covered by the brush rectangle. As the brush moves over the window the computations are updated and
the results are displayed in a new view in a time series plot format (see Figure 4). Tracing is particularly
useful for classifying variables and detecting spatial dependence. Linking in trace plots does not reflect
a one-to-one correspondence but either a one-to-n or an m-to-n correspondence. Selecting a county in
the map will highlight all calculations in the trace plot that use the county’s measurements. Selecting a
point in the trace plot will highlight all counties in the map that contribute to this value. In addition, all
other points in the trace plot that also make use of the highlighted counties’ measurements are selected.
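
The computation behind a trace can be sketched as follows. This is only an illustration: the brush
is simplified to a vertical strip moved along the x-axis, the statistic is the mean, and the function
names `trace` and `trace_points_using` are hypothetical.

```python
from statistics import mean
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def trace(points: Sequence[Point], values: Sequence[float],
          brush_positions: Sequence[float],
          width: float) -> Tuple[List[float], List[List[int]]]:
    """Return per-position means and the indices of contributing points.

    The brush here is a vertical strip [x0, x0 + width) moved along the x-axis;
    this simplification of a brush rectangle is an assumption of the sketch.
    """
    means: List[float] = []
    members: List[List[int]] = []
    for x0 in brush_positions:
        inside = [i for i, (x, _y) in enumerate(points) if x0 <= x < x0 + width]
        members.append(inside)
        # An empty brush has no defined mean; record NaN for that position.
        means.append(mean(values[i] for i in inside) if inside else float("nan"))
    return means, members


def trace_points_using(case: int, members: Sequence[Sequence[int]]) -> List[int]:
    """Linking map -> trace plot: all trace values that use a given case."""
    return [k for k, inside in enumerate(members) if case in inside]


points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
values = [10.0, 20.0, 30.0, 40.0]
means, members = trace(points, values, brush_positions=[0.0, 1.0, 2.0], width=2.0)
linked = trace_points_using(1, members)  # trace values that use case 1
```

The `members` list carries the many-to-many link: one case can feed several trace values, and one
trace value is fed by several cases.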

Spatially referenced area data are typically based on politically defined regions. Bar charts or
histograms that reflect the number of regions falling in a particular class are often misleading and carry
little information. Instead, some demographic figure, like total population or population density, might be
more appropriate as the weight for each bin area than the bare number of regions. In Manet weighting
is possible for bar charts, histograms, and mosaic plots, and it is performed by simply multiplying each
case of the displayed variable by the corresponding value of the weight variable.

Figure 3: On the left hand is a polygon map of the counties in Bavaria; on the right hand is the trace
plot for four variables on kind of land usage.


Figure 4: On the left hand is a polygon map of the counties in Bavaria; on the right hand is the trace
plot for four variables on kind of land usage.

Weighted  versions  are  available  for  histograms,  bar  charts  and  mosaic  plots.  The  plotted  areas  do  not 
reflect  the  number  of  counts  in  each  class,  as  would  be  done  by  the  standard  plots  of  this  type,  but  the 
areas  reflect  the  amount  of  another  variable  measured  at  the  objects  that  fall  in  a  class,  see  Figure  5.  In 
many  surveys  such  weighted  plots  help  to  adjust  results  and  to  avoid  false  impressions  that  are  mainly 
caused  by  a  specific  structure  in  the  underlying  sample  space. 
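
The difference between a standard and a weighted bar chart can be shown with a short sketch. The
function name `bar_heights` is hypothetical, and the state labels and population figures below are
made-up illustrative values, not the data behind Figure 5.

```python
from collections import defaultdict
from typing import Dict, Iterable, Optional


def bar_heights(categories: Iterable[str],
                weights: Optional[Iterable[float]] = None) -> Dict[str, float]:
    """Bar sizes for a bar chart; without weights each case counts as 1."""
    cats = list(categories)
    w = [1.0] * len(cats) if weights is None else list(weights)
    if len(w) != len(cats):
        raise ValueError("categories and weights must have equal length")
    heights: Dict[str, float] = defaultdict(float)
    for cat, weight in zip(cats, w):
        # Each case contributes its weight instead of a plain count.
        heights[cat] += weight
    return dict(heights)


# Three counties: two in one state, one in another (illustrative values).
states = ["IL", "IL", "WI"]
population = [5_000_000, 50_000, 900_000]

by_count = bar_heights(states)                  # number of counties per state
by_population = bar_heights(states, population)  # total population per state
```

With counts the two states differ by a factor of two; with population as weight the picture is
dominated by the single large county.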

CONCLUSION 

Linking low-dimensional views is an easily interpretable approach to analyzing multivariate data. This
approach can be extended straightforwardly to spatial data by adding a map to the plot toolbox. Highly
interactive environments give full support to the human being, who is still the best pattern recognizer. In
contrast to linkages between GIS and statistical software, integrated software is easier
to use, offers a more seamless transition from non-spatial to spatial problems, and, most importantly,
is much faster. In addition, low-dimensional views, like bar charts and histograms, are also familiar to
researchers and analysts who are not specialized in statistics. Therefore, communication with clients is
much easier when using these types of plots. Further information on the MANET project can be found on
the World Wide Web at

http://www1.math.uni-augsburg.de/~manet/

Figure 5: On the left we show a standard bar chart in which the area of the bars represents the number of
counties in certain states in the Midwest, whereas on the right the area reflects the total population
in those counties.

Future work will concentrate on extensions of the linking paradigm to different types of spatial location.
Various data matrices then have to be connected. In the hierarchical case the procedure is
straightforward. Whenever a region that is higher in the hierarchy is highlighted, all areas that are
included in that region are highlighted too. If an area is highlighted, the region containing it is partially
highlighted, either by using a grey scale or by partitioning the region.
ABSTRACT

This paper deals with the control of radio communications in a battlefield, in particular communications
among scouting robots. It is assumed that the robots operate in groups and exchange over a common radio
channel coded messages about their states, positions, tasks, and observations. Control of the communications is
needed to reduce message collisions, and consists of a control of access to the channel. To reduce vulnerability,
the control should be distributed among the robots such that each robot independently controls its access in a
manner that enhances the overall information throughput. Inputs for such a control are approximate statistics of
the network status that are obtained by passive monitoring of the message traffic. The control rules are heuristic
because the controlled process does not have a set point. These circumstances suggest the use of fuzzy-logic
control procedures. The author has developed such control procedures and tested their behavior on a computer
model of battlefield communications.


INTRODUCTION 

A typical communications network in a battlefield consists of a moderate number of nodes that are
broadcasting on a single low-bandwidth radio channel. Problems arise when more than one node tries to access
the channel at the same time. The ensuing message collisions can cause a breakdown of communications just at
those times when information exchange is important.

To  ensure  smooth  communication  and  to  enhance  the  information  throughput  rate,  a  control  of  network 
access  is  mandatory.  However,  a  hierarchical  control  is  not  practical  in  a  battlefield  environment  for  two 
reasons.  First,  the  status  of  the  network  changes  dynamically.  Therefore,  a  regulating  node  must  be  constantly 
supplied  with  information  about  the  present  status.  Such  information  gathering  would  use  up  broadcasting  time, 
and  the  received  information  might  be  outdated  on  arrival.  Second,  the  concentration  of  control  in  a  single  node 
makes  the  communications  network  more  vulnerable. 

These difficulties can be avoided by using a distributed control where each node listens to the network traffic
and makes independent decisions for accessing the network. This is possible because in a modern battlefield
communication system, the nodes are computers and the communication is in digital form among these
computers. That is, each node has ample computing power to analyze the network traffic and make decisions.

The inputs for the control procedures are obtained by passive listening that provides only approximate, albeit
up-to-date, information about the network status. Therefore, the control rules must be of a type that allows
approximate input. Also, the goal of the control is defined only approximately as "enhance the information
throughput." The system has no set point, and control rules that are based on error terms are not applicable. This
suggests the use of fuzzy-logic control rules. This paper describes such control procedures that have been
developed at the U.S. Army Research Laboratory.


Approved  for  public  release;  distribution  is  unlimited. 
NETWORK ACCESS ALGORITHM

We  are  seeking  access  control  procedures  with  the  following  properties: 

1)  All  nodes  have  equal  chances  for  access. 

2)  Information  throughput  is  enhanced. 

3)  Priority  messages  have  higher  access  chances. 

An algorithm that achieves these goals can be constructed around the following broadcast procedure. A monitor
at each node constantly listens to the radio channel and informs the access manager program when the channel is
free. At that time, t, the manager determines which message from its message queue should be broadcast,
chooses a random time interval, Δ, from a prescribed access delay interval range (0,D), and sets a tentative
broadcast time, t + Δ, for the message. If the channel is still free at the set time, then the message is broadcast.
Otherwise, the manager aborts the procedure and waits for the next free time signal.
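
One round of this broadcast procedure can be sketched in a few lines. The sketch is idealized and is
not the author's program: the random delay is assumed to be uniformly distributed on (0, D), response
times and propagation delays are taken as zero, and the function names are hypothetical.

```python
import random
from typing import Dict, Optional, Sequence


def draw_delays(nodes: Sequence[str], D: float,
                rng: random.Random) -> Dict[str, float]:
    """Each node with a queued message draws a delay from (0, D).

    A uniform distribution is an assumption of this sketch.
    """
    return {node: rng.uniform(0.0, D) for node in nodes}


def first_broadcaster(t: float, delays: Dict[str, float]) -> Optional[str]:
    """Node whose tentative broadcast time t + delay comes first.

    Idealized: every other node finds the channel busy at its own set time,
    aborts, and waits for the next free-channel signal.
    """
    if not delays:
        return None
    return min(delays, key=lambda node: t + delays[node])


rng = random.Random(0)
delays = draw_delays(["robot1", "robot2", "robot3"], D=4.0, rng=rng)
winner = first_broadcaster(t=0.0, delays=delays)
```

Since every node draws from the same range, each node is equally likely to be the winner of a round.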

Obviously, the procedure satisfies the first property if all nodes use the same access delay interval, D. The
procedure would also prevent message collisions if the response time of the nodes were zero and the
propagation speed of radio signals infinite. In practice, however, the response times are finite, and it is estimated
that messages from different nodes will be broadcast simultaneously and collide if their set times, t + Δ, differ by
less than 0.5 s. (A typical message length is of the order of 1 to 10 s.) Collisions reduce the information
throughput rate because collided messages must be repeated. The probability of collisions is reduced by
increasing the size of the access delay interval D. On the other hand, a large D means long idle times for the
network, and the idle time is reduced by reducing D. Hence, there exists an optimal D that corresponds to a
maximum throughput rate and somehow depends on the size and state of the network. The goal of the control
procedure is to find, in real time, an optimal D from information that can be obtained by monitoring the network
traffic.
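
The trade-off between collisions and idle time can be made concrete under simplifying assumptions
that are not part of the original model: n nodes contend in the same round, their delays are
independent and uniform on (0, D), and a collision occurs when the two earliest set times differ by
less than c = 0.5 s. The gap between the two smallest of n such delays exceeds c with probability
(1 - c/D)^n, and the expected smallest delay is D/(n + 1).

```python
def collision_probability(n: int, D: float, c: float = 0.5) -> float:
    """P(two earliest set times differ by less than c), uniform delays on (0, D)."""
    if n < 2:
        return 0.0  # a single contender cannot collide
    if D <= c:
        return 1.0  # all set times lie within c of each other
    return 1.0 - (1.0 - c / D) ** n


def expected_idle_time(n: int, D: float) -> float:
    """Expected wait until the earliest set time, uniform delays on (0, D)."""
    return D / (n + 1)


# Larger D lowers the collision probability but lengthens the idle time.
for D in (1.0, 2.0, 5.0, 10.0):
    print(D, round(collision_probability(4, D), 3), round(expected_idle_time(4, D), 3))
```

Under these assumptions the collision probability also grows with the number of contending nodes,
which is consistent with an optimal D that depends on the size and state of the network.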

The third property is taken care of by reducing the global D for high-priority messages and increasing it for
low-priority messages. The priorities are assigned automatically by the message-generating programs. (Typical
information exchanged by scouting robots consists of data about the robots’ positions, states, and current tasks;
descriptions of observed targets; and changes of environments in the vicinities of the robots.)

INPUT  INFORMATION 

The state of a battlefield communications network is completely described by the number and state of the
nodes, the message queue at each node, and the characteristics of external noise. However, the current values of
these state parameters are not available to the access managers at the nodes. Instead, the network conditions
must be inferred from approximate information that is obtained by listening to the network traffic. We have
chosen three groups of network parameters that can be obtained by listening. These parameters represent time
averages of observations during a listening time interval, L, prior to the reference time t. One group consists of
the relative usage time of the radio channel during the time interval (t - L, t) in three usage categories. A
second group consists of the relative number of network accesses during the same time interval in the three
categories. A third group consists of average message lengths in the three categories. The categories of network
usage time are:

1.  Idle time.

2.  Successful  transmissions. 

3.  Collided  transmissions. 

For effective control, one also needs predictions about the expected status of the network. Such predictions can
be obtained from the trends of the observed parameters, which usually are obtained by numerical differentiation.
However, in the present problem, simple numerical differentiation cannot be used because of the oscillatory
character of the data. Therefore, trend indicators were obtained by subtracting from the simple averages of the
parameters the corresponding weighted averages over the same averaging interval L with linearly decreasing
weights.
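
A trend indicator of this kind can be sketched as follows. The text does not state in which direction
the weights decrease; the sketch assumes that they fall linearly from the oldest to the newest sample,
so that a rising series yields a positive indicator. The function name is hypothetical.

```python
from typing import Sequence


def trend_indicator(samples: Sequence[float]) -> float:
    """Simple average minus a linearly weighted average of the same samples.

    `samples` are ordered oldest to newest. Weights fall linearly from n for
    the oldest sample to 1 for the newest (an assumed direction), so a rising
    series gives a positive value and a falling one a negative value.
    """
    n = len(samples)
    if n == 0:
        raise ValueError("need at least one sample")
    simple = sum(samples) / n
    weights = range(n, 0, -1)  # n, n-1, ..., 1
    weighted = sum(w * x for w, x in zip(weights, samples)) / sum(weights)
    return simple - weighted


rising = trend_indicator([1.0, 2.0, 3.0])
falling = trend_indicator([3.0, 2.0, 1.0])
flat = trend_indicator([2.0, 2.0, 2.0])
```

Because both averages run over the whole interval, single oscillations largely cancel, which is the
reason given in the text for preferring this indicator to a finite difference.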

The length L of the listening interval cannot be chosen arbitrarily. An excessively short length provides only
useless data about the instantaneous channel status, while averaging over an excessively large L is not sufficiently
responsive to changing trends of the averaged parameters. Therefore, the access controller also controls the
listening and averaging interval L by adjusting it as necessary if network conditions change.


CONTROL  RULES 

The  network  access  algorithm  is  controlled  by  the  length  D  of  the  access  delay  interval.  The  choice  of  D  is 
in  turn  influenced  by  the  length  L  of  the  monitoring  and  averaging  interval.  The  procedures  for  the  control  of 
these  two  access  parameters  were  formulated  in  terms  of  fuzzy-logic  rules.  In  this  section,  we  provide  a  short 
description  of  these  rules.  A  more  detailed  description  and  discussion  is  found  in  reference  [2]. 

CONTROL  OF  MONITORING 

The proper size of the monitoring and averaging interval L was found by experiments to be about 40 to 50
times the average length of transmitted messages. (These and other experiments were done with a computer
model of battlefield networks [1].) To accommodate changing network conditions, L was controlled
dynamically; that is, L was increased or decreased from its current value if it deviated significantly from the set
value of 50 times the current message length. The deviation was expressed in terms of the dimensionless
quotient

$$Q = \frac{L}{50 \cdot a} - 1 , \tag{1}$$

where a is the average message length. The correction of $L_{\text{old}}$ was expressed by a factor X as follows:

$$L_{\text{new}} = L_{\text{old}} \cdot (1 + X) . \tag{2}$$

The  control  rules  for  the  computation  of  the  corrector  X  for  given  deviation  Q  are  summarized  in  the  following 
fuzzy-rule  table. 


Q  |  NL   N    Z    P    PL
X  |  PL   P    Z    N    NL

Here,  NL,  N,  Z,  P,  and  PL  denote  fuzzy  sets  that  define  the  categories  "negative  large,"  "negative,"  "zero," 
"positive,"  and  "positive  large,"  respectively.  The  first  rule  in  the  rule  table  is 

If  Q  is  "negative  large,"  then  make  X  "positive  large." 

The  other  four  table  entries  denote  corresponding  rules.  The  membership  functions  of  the  fuzzy  sets  that 
describe  the  categories  were  determined  by  numerical  experiments  with  the  battlefield  network  model. 
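
As a rough illustration of how Eqs. (1) and (2) and the rule table fit together, the sketch below evaluates the five rules for a given L and a. It is deliberately simpler than the method used in the paper: it uses triangular input sets and a weighted average of single output values instead of the compositional rule of inference with center-of-gravity defuzzification, and every numerical value in `Q_CENTERS` and `X_CENTERS` is a made-up placeholder, since the tuned membership functions are not given in the text.

```python
LABELS = ["NL", "N", "Z", "P", "PL"]

# Rule table from the text: input category of Q -> output category of X.
RULES = {"NL": "PL", "N": "P", "Z": "Z", "P": "N", "PL": "NL"}

# Hypothetical category centers (NOT from the paper).
Q_CENTERS = {"NL": -0.5, "N": -0.25, "Z": 0.0, "P": 0.25, "PL": 0.5}
X_CENTERS = {"NL": -0.2, "N": -0.1, "Z": 0.0, "P": 0.1, "PL": 0.2}
HALF_WIDTH = 0.25  # spacing between neighboring input centers


def membership(q, label):
    """Triangular membership; the two outer sets saturate at 1 (shoulders)."""
    center = Q_CENTERS[label]
    if label == "NL" and q <= center:
        return 1.0
    if label == "PL" and q >= center:
        return 1.0
    return max(0.0, 1.0 - abs(q - center) / HALF_WIDTH)


def corrector(L, a):
    """Compute X from the deviation Q = L / (50 a) - 1, Eq. (1)."""
    q = L / (50.0 * a) - 1.0
    num = 0.0
    den = 0.0
    for label in LABELS:
        mu = membership(q, label)
        num += mu * X_CENTERS[RULES[label]]
        den += mu
    return num / den  # den >= 1 here because the input sets cover the whole axis


def update_L(L, a):
    """Apply Eq. (2): L_new = L_old * (1 + X)."""
    return L * (1.0 + corrector(L, a))
```

With these placeholder values, an interval that is twice the set value (L = 100, a = 1) is shortened by 20 percent per update, and an interval at the set value is left unchanged.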

CONTROL  OF  ACCESS 

The access to the radio channel is regulated by the size of the access delay interval D. In principle, the size
of D can be efficiently controlled by the simple algorithm

$$D_{\text{new}} = D_{\text{old}} \cdot (1 + \delta) , \tag{3}$$

where $\delta$ is an output of the control rules. In practice, this simple procedure must be modified to ensure that after
an initialization time, all independent controllers indeed produce the same value of D, even when starting from
different initial conditions. For clarity, we shall first describe the control rules in the context of Eq. (3) and
discuss the modifications of the algorithm later.

Let c - i be the difference between the relative uses of channel time for colliding messages and for idling,
respectively, during the listening interval L. A common-sense rule for the governing of D is "increase D if
c - i > 0 and decrease D if c - i < 0." The rule table is as follows.


c - i     |  NL   N    Z    P    PL
$\delta$  |  NL   N    Z    P    PL

This set of rules turns out to be very effective, but it does not account for the possibly different message lengths.
Thus, a single collision involving a very long message can increase the average collision time, c, as much as
many collisions involving short messages, but in each of these cases, the proper control strategy is different.
Therefore, as a second input, the trend of the number of colliding accesses was used. Let $\tau_c$ be the observed
trend. Then the rule table is as follows.


$\tau_c$  |  NL   N    Z    P    PL
$\delta$  |  NL   N    Z    P    PL

To  make  the  control  more  responsive  to  extreme  conditions,  two  more  inputs  were  considered:  "idle  time  over  a 
threshold"  and  "collision  time  over  a  threshold."  The  rule  table  for  the  former  was  as  follows. 


$i - t_i$  |  NL   N    Z    P    PL
$\delta$   |  O    O    O    NL   NL

Here  O  indicates  "no  output";  that  is,  no  rule  is  fired  in  these  cases.  These  rules  provide  an  acceleration  of  the 
reduction of D if i is unreasonably large. The corresponding rule table for the collision time over a threshold is
as follows.


$c - t_c$  |  NL   N    Z    P    PL
$\delta$   |  O    O    O    PL   PL

Experiments  show,  however,  that  the  last  two  sets  of  rules  have  only  a  minor  effect  on  the  performance  of  the 
control  if  used  in  addition  to  the  first  two  sets  of  rules.  If  used  without  the  first  two  sets  of  rules,  then  the 
performance  of  the  control  was  not  as  good  as  with  the  first  two  sets  alone. 

The outputs from the rules were combined using the compositional rule of inference [3-6], and the fuzzy
result was defuzzified using the center of gravity method [5]. A combination of rules in the form of
two-dimensional rule tables was also tried and found to perform equally well.

The described control works efficiently if all nodes start with the same initial value of D and have the same
observational input. In practice, one can only assume that the monitored inputs are approximately equal for all
nodes, but the initial conditions can be quite different because different nodes might join the network at different
times. Therefore, the updating formula (3) for D must be modified such that after an operation over a reasonable
time (several minutes), differences among the nodes become negligible. One method to achieve this is to replace
Eq. (3) by an absolute output instead of the corrective one. However, experiments indicate that such a control is
not efficient because absolute outputs cannot be easily adapted to changing network conditions. We now
describe a replacement of Eq. (3) that was found satisfactory.

First, we separate in Eq. (3) the dimensional factor D from the nondimensional corrector $F = 1 + \delta$, and
devise for F an updating procedure such that the factor drifts with time to a fixed value that is independent of its
initial value. Let $\Delta t$ [s] be the difference between the current and previous times of network parameter readings.
(Network parameters are updated at discrete times after the end of each activity, that is, at the end of each
transmitted message, because only at those instants can the average usage times and access numbers be
computed.) Let f be a fixed value of F to which the correction factor should drift with increasing time. Let
$e = \exp(-\Delta t/60)$. Then the factor is updated as follows:

$$F_{\text{new}} = [\, f \cdot (1 - e) + F_{\text{old}} \cdot e \,] \cdot (1 + \delta) . \tag{4}$$

One  can  show  that  the  exponential  factor  in  the  formula  has  the  effect  that  the  starting  value  of  F  becomes 
insignificant  after  a  few  minutes.  Hence,  if  all  nodes  would  use  the  same  f,  then  equalization  among  the  nodes 
would  result.  Experiments  show,  however,  that  an  a  priori  assignment  of  a  unique  value  of  f  for  all  conditions  is 
not  desirable.  Rather,  to  have  efficient  access  control,  its  value  should  be  allowed  to  vary  typically  between 
0.001  and  about  80.  We  achieve  equalization  of  f  by  assigning  to  it  only  a  limited  number  of  discrete  values 
within  this  range.  In  particular,  the  access  controllers  assign  to  f  values  in  steps  of  four  according  to  the 
following  algorithm: 

If $\delta_{\text{old}} > 0$ and $\delta_{\text{new}} > 0$, then increase f;

If $\delta_{\text{old}} < 0$ and $\delta_{\text{new}} < 0$, then decrease f.

Equalization of the factor f among the nodes occurs in a short time because typically the control output $\delta$
continuously increases or decreases over more than 20 readings.
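
The claim that the starting value of F becomes insignificant after a few minutes can be checked numerically. The sketch below applies the update of Eq. (4) to two nodes that start from very different values of F but see the same sequence of control outputs. The function names, the reading interval of 5 s, and the sample control outputs are illustrative assumptions, not values from the paper.

```python
import math


def update_F(F_old, f, delta, dt, tau=60.0):
    """One update of the corrector F according to Eq. (4).

    dt is the time in seconds since the previous reading; tau = 60 s is the
    time constant that appears in e = exp(-dt / 60).
    """
    e = math.exp(-dt / tau)
    return (f * (1.0 - e) + F_old * e) * (1.0 + delta)


def simulate(F0, f, deltas, dt):
    """Apply a sequence of control outputs, one per reading, starting at F0."""
    F = F0
    for delta in deltas:
        F = update_F(F, f, delta, dt)
    return F
```

Each update multiplies the difference between two nodes by e (1 + delta), so with readings every 5 s and small control outputs the difference shrinks by roughly a factor exp(-10) over ten minutes.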

A reasonable value for the dimensional factor $D_{\text{old}}$ is, for instance, the average idle time. However, that
average increases when D is increased and vice versa, thereby accelerating any corrections by the control. In
extreme conditions, this results in a drift of D to zero or infinity. We avoid such excursions by using the
logarithm of the average idle time instead of the average itself. The average idle time is also subject to smaller
oscillations between readings that are not conducive to access control. These oscillations were reduced by using
a weighted average of the current and previous reading of the average idle time. The weighting was again done
with an exponential factor that reduces the influence of old readings. Let $v = \exp(-\Delta t/300)$. Then the effective
value of the average idle time is computed by

$$i_{\text{eff}} = \frac{v \cdot i_{\text{old}} + i_{\text{new}}}{v + 1} . \tag{5}$$

With these modifications, the final formula for the access delay time that replaces Eq. (3) is

$$D_{\text{new}} = [\, 0.01 + \log(1 + i_{\text{eff}}/4) \,] \cdot F_{\text{new}} . \tag{6}$$
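
A small sketch of the last two steps follows. It assumes that Eq. (5) is the weighted average (v i_old + i_new) / (v + 1) with v = exp(-Δt/300), that the logarithm in the final formula is the natural logarithm, and that idle times are measured in seconds; none of these details is stated unambiguously in the text. The function names are hypothetical.

```python
import math


def effective_idle(i_old, i_new, dt, tau=300.0):
    """Smoothed average idle time: the old reading is down-weighted by v."""
    v = math.exp(-dt / tau)
    return (v * i_old + i_new) / (v + 1.0)


def access_delay(i_eff, F_new):
    """Access delay D_new = [0.01 + log(1 + i_eff / 4)] * F_new."""
    return (0.01 + math.log(1.0 + i_eff / 4.0)) * F_new
```

The constant 0.01 keeps D positive when the channel is never idle, and the logarithm limits the feedback between D and the idle time that the text describes.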


GRANULATION 

In  all  sets  of  rules,  five  categories  of  input  and  output  were  found  to  be  sufficient.  The  membership  functions 
that define the categories were assumed to be trapezoids. The utility programs that were developed and used for
rule combinations and defuzzifications accept arbitrary forms of membership functions, but it was found that
more general functions are not needed to achieve optimal control. Numerous experiments were carried out to
determine optimal definitions of input/output categories under different conditions. Results of these experiments
indicate that the best membership functions are robust in the sense that similar functions perform well for a wide
variety of networks and that the corresponding minima are flat. If the control should be implemented in field
equipment, then special fuzzy-logic chips would be used for the fuzzy-logic operations. The results of the
numerical experiments will provide guidance for the design of such chips.

Figure 1 shows the optimal categories for the rules involving the difference c - i and the trend $\tau_c$, and
Figure 2 shows the output categories of the control parameter $\delta$. It is interesting to note slight asymmetries in
the granulations of the inputs. These asymmetries were not chosen arbitrarily but developed by the tuning
process  of  the  granulations.  One  of  the  effects  of  the  asymmetries  is  that,  in  general,  the  collision  time  is  kept  at 
a  fraction  of  the  idle  time. 


Figure 1. Input categories.

Figure 2. Granulation of control output.


EXAMPLE

Let the network consist of n nodes, and let us assume that the message generation rate is sufficiently high so
that all nodes accumulate queues of messages for broadcasting. Further, let the message generation stop at a
preset time. Then a good measure for the performance of access control is the time that is needed to empty all
queues. Using different control strategies with the same set of message queues, one can compare the control
performances by comparing the corresponding completion times. We present an example of a network with four
nodes. The information content, measured by the total length of all messages in a queue, was approximately
equal for all nodes, but the average lengths of messages were different for different nodes. In Figure 3, the
results of an experiment with fixed access delay time are compared with the results with controlled delay time.
The left-hand graph shows curves of network usage times for a fixed access delay time D of 13 seconds. All
queues are emptied in about 34 minutes. The graph shows that during that time the channel has been used for
message transmissions about 17 minutes (dashed curve), for idling about 11 minutes (solid curve), and for
colliding messages about 6 minutes (dotted curve). It is obvious that during the first 20 minutes the access delay
time interval D was too short, causing many message collisions. On the other hand, during the last 10 minutes
of the experiment, there were no new collisions (the dotted curve is flat), and a smaller D could be used to reduce
the idle time. The right-hand graph shows the same network with a controlled delay time interval D. We
observe that the dynamically adjusted D equalizes the collision and idle times and achieves a shorter completion
time. In this experiment, the dynamically adjusted access delay time D was varied between less than 1 s and
about 65 s.


CONCLUSION 

The  distributed  control  of  battlefield  communications  has  been  found  efficient,  and  the  optimal  parameters  of 
the  investigated  control  algorithm  were  found  to  vary  little  under  very  different  conditions.  This  result  makes  the 
design of an efficient general fuzzy-logic control of battlefield communications possible.

Figure 3. Network usage with fixed and controlled access.


ABSTRACT

Real-time  simulation  of  particle  behaviors  is  very  useful  in  training,  education,  art,  advertising,  and 
entertainment.  There  is  no  successful  model  for  realistic  dust  behaviors  generated  by  a  traveling  vehicle.  In  this 
paper,  we  use  particle  systems  and  behavioral  simulation  techniques  to  simulate  these  dust  behaviors  in  real-time. 
First  we  analyze  the  forces  and  factors  which  affect  the  dust  generation  and  the  behaviors  after  dust  particles  are 
generated.  Then  we  construct  physically-based  empirical  models  to  generate  dust  particles  and  control  the  behaviors 
accordingly. After that, we further simplify the numerical calculations by dividing the dust behaviors into three
stages and establishing a simplified particle system model for each stage. We employ motion blur, particle
blending, texture mapping, and other computer graphics techniques to achieve the final results. Our major
contributions include analyzing dust behaviors in detail and constructing physically-based empirical models that
correlate the behaviors to the dust generating forces and other factors and that achieve simulations in real-time.


INTRODUCTION 

In many virtual environments and distributed interactive simulations, it is desirable to simulate trucks, tanks,
armored vehicles, bulldozers, and other ground moving objects. However, typically dust behaviors are not
generated when these objects travel on an unpaved road. Dust behaviors caused by different factors (such as natural
wind and a fast traveling vehicle) appear everywhere. Simulating physically realistic, complex dust behaviors is
very useful in interactive graphics applications, such as computer art, advertising, education, entertainment, and
training. However, due to the lack of modeling and simulation techniques and methodologies, there is currently no
successful real-time simulation for realistic dust behaviors. As computers and their graphics systems become much
faster and more powerful, many natural phenomena (such as the behaviors of fluids, terrains, trees, fireworks,
volcanoes, clouds, etc.) are simulated in real-time [1, 5, 6, 7, 8, 10, 11, 12, 13]. We believe it is appropriate now to
include dust behaviors into real-time simulation.


Hsu  and  Wong  [3]  introduced  a  dust  accumulation  model.  Their  model  presents  static  appearance  of  dust 
accumulation  without  behavior  and  animation.  Cowherd,  Williams,  and  other  researchers  [2,  14]  studied  dust  and 
the  mechanisms  of  dust  generation.  Their  purpose  was  to  study  and  measure  the  density  of  the  dust  in  the  real 
battlefield  instead  of  simulating  the  dust  behaviors  in  graphics.  Today,  military  training  in  graphics  and  distributed 
interactive  simulation  is  one  of  the  major  topics  for  research  and  applications  [4],  and  generating  dust  behaviors  in 
real-time  significantly  increases  the  realism  of  the  simulated  training  environment. 


In this paper, we introduce a method for simulating the dust behaviors caused by a fast traveling vehicle in
real-time. The method is a combination of particle systems and behavioral simulation techniques. The particle systems
technique was first introduced to computer graphics by Reeves [8] in 1983, and is now widely used to simulate
fuzzy or dynamic objects, such as fire, grass, explosions, clouds, water, trees, etc. These objects have no fixed shape
and sometimes change their shapes or behaviors stochastically. They have ill-defined boundaries that make
surface-based modeling impractical. It is apparent that dust behaviors behind a moving vehicle belong to this category. The
behavioral simulation technique uses a physically-based modeling method to calculate and update the object's
state, and draws the object repetitively after each calculation to achieve the behavior animation in real-time. We
also employ motion blur for small and fast moving particles, particle blending instead of hidden-surface removal,
texture mapping, and other graphics techniques to achieve better performance and appearance of the final results.


1. Approved for public release; distribution is unlimited.

The work of Dr. Wegman was supported by the Army Research Office under contract DAAH04-94-G-0267 and by a National Science Foundation
Group Infrastructure Grant DMS-9631351. This work was completed while Dr. Wegman was a Navy-ASEE Distinguished Faculty Fellow at the Naval
Surface Warfare Center, Dahlgren.




In order to build up a physically-based realistic simulation, we first analyze the forces and factors which affect
the dust generation and the behaviors after dust particles are generated, and then construct physically-based
empirical models to generate dust particles and control the behaviors accordingly. However, the models are
time-consuming and inefficient. Therefore, based on the models and analysis of the forces, we further simplify the
numerical calculations by dividing the dust behaviors into three stages and establishing a simplified particle system
model for each stage. The resulting models are fast enough for real-time simulation and still produce realistic
dust behaviors. Our major contributions include analyzing dust behaviors in detail and constructing physically-based
empirical models that correlate the behaviors to the dust generating forces and other factors and that achieve
simulations in real-time. Our work is a useful addition to many applications in simulated virtual environments,
including military simulation and training.


DUST  BEHAVIORS 

In this section, we first discuss and analyze how dust particles are generated; then we introduce the factors
which affect the dust behaviors; after that, we analyze the forces acting on a dust particle and establish
corresponding physically-based empirical models to calculate and update the dust behaviors.


DUST  GENERATION 

As a vehicle wheel passes over an unpaved surface, three basic forces are developed: vertical pressure,
horizontal stress, and friction. Vertical pressure, which is due to the weight of the vehicle, will produce
ground surface vibration and/or deformation, crushing large particles into smaller ones. Horizontal stress and
friction, which are largely due to the driving power that sustains the velocity and acceleration of the
vehicle, will further comminute the particles and carry them on the surface of the tire. The slippage between the tire
and the ground surface will lift particles of different sizes due to the adhesive and shear forces, and eject them at
different places on the tire surface due to the centrifugal forces.

The  ground  vibration  and/or  deformation  will  also  eject  fine  particles.  The  dust  particles  are  then  either 
entrained  in  the  turbulent  air  behind  the  vehicle  or  return  to  the  ground  depending  on  their  properties  and 
conditions.  Bigger  particles  will  fall  back  to  the  ground  surface  more  rapidly  while  the  fine  ones  will  remain 
suspended in the air, drifting with the current. Small stones and blocks of mud will fall back to the ground
immediately after ejection from the tires, and will bounce up and down, also generating dust into the air.


Figure  1:  Dust  generation  process 

There are many other important factors that affect dust generation, such as the material and composition of the
ground surface ($Mat_p$), the size and properties of the vehicle ($S_{car}$), and the conditions of the environment ($F_{env}$).


The material and composition of the ground surface ($Mat_p$) depend on the density ($D_p$) and moisture ($MO_p$) of
the ground surface, and on the average size ($S_p$), mass, and adhesion of each individual dust particle. If the ground is wet
and the average size of the particles is large, then there will be fewer particles. If the dust density of the ground
surface is high, then there will be more particles. We use the following equation to measure this parameter:

$$Mat_p = \frac{D_p}{\alpha_1 S_p + \alpha_2 MO_p} \tag{1}$$

where $\sum_{i=1}^{2} \alpha_i = 1$, and $\alpha_i$ is the weight coefficient of the corresponding parameter, for i = 1, 2. The size and
properties of the vehicle ($S_{car}$) depend on the weight ($WT_{car}$), height ($H_{car}$), and width ($W_{car}$) of the vehicle.
A heavier and bigger vehicle will generate more dust particles. We use the following equation:

$$S_{car} = \beta_1 \cdot WT_{car} + \beta_2 \cdot H_{car} + \beta_3 \cdot W_{car} \tag{2}$$

where $\sum_{i=1}^{3} \beta_i = 1$, and $\beta_i$ is the weight coefficient of the corresponding parameter, for i = 1, 2, 3. The
conditions of the environment ($F_{env}$), including humidity, air pressure, and many other environment damping forces,
will also affect the dust generation.


In summary, most dust particles are generated right behind the wheels. Some fine dust particles are lifted from
the ground surface because of the turbulent wake behind the vehicle. Each particle is generated with its own initial
mass, size, and velocity. The number of dust particles generated by the vehicle for each simulating frame is
calculated as follows:

$$N = \frac{|V_{car}| \, (\gamma_1 \cdot Mat_p + \gamma_2 \cdot S_{car})}{|F_{env}|} \tag{3}$$

where $N$ is the number of particles generated per frame, $\sum_{i=1}^{2} \gamma_i = 1$, and $\gamma_i$ is the weight coefficient of the
corresponding parameter, for i = 1, 2. Once the
particles are entrained in the turbulent wakes behind the vehicle, their behaviors are affected by similar factors,
which are discussed in the next section.
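
The three generation equations can be strung together as below. This is only a sketch: it takes Eq. (1) as a ratio of ground density to a weighted sum of particle size and moisture, Eq. (2) as a weighted sum of vehicle weight, height, and width, and Eq. (3) as the vehicle speed times a weighted sum of the two, divided by the magnitude of the environmental factor. The function names, the equal default weights, and the input units are assumptions made for illustration.

```python
def ground_material(D_p, S_p, MO_p, alpha=(0.5, 0.5)):
    """Eq. (1): Mat_p = D_p / (alpha1 * S_p + alpha2 * MO_p)."""
    a1, a2 = alpha
    return D_p / (a1 * S_p + a2 * MO_p)


def vehicle_size(WT_car, H_car, W_car, beta=(1 / 3, 1 / 3, 1 / 3)):
    """Eq. (2): weighted sum of vehicle weight, height, and width."""
    b1, b2, b3 = beta
    return b1 * WT_car + b2 * H_car + b3 * W_car


def particles_per_frame(v_car, mat_p, s_car, f_env, gamma=(0.5, 0.5)):
    """Eq. (3): |V_car| * (gamma1 * Mat_p + gamma2 * S_car) / |F_env|."""
    g1, g2 = gamma
    return abs(v_car) * (g1 * mat_p + g2 * s_car) / abs(f_env)
```

In a simulation loop the result would be rounded to an integer particle count; the weights in each group are meant to sum to one, as the text requires.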


FACTORS  AFFECTING  DUST  BEHAVIORS 

There are numerous factors that affect the dust behaviors caused by a fast traveling vehicle. Here we
summarize the factors that have the strongest effects on the dust behaviors.


When a vehicle travels quickly, it produces a 3D volume behind it where the atmospheric pressure is lower than
that of the other areas. The turbulent wake is mostly inside this volume and generates vigorous dust behaviors. The
shape and size of the 3D volume are mainly decided by the velocity, height, and width of the vehicle. At the same
time, the properties of each individual dust particle will affect its own behaviors. These properties include the
shape, size, mass, and initial conditions. Intuitively, a particle will fall back to the ground faster if it is heavier and
smaller. In addition, humidity, wind, and environmental damping forces also have an effect on the dust behaviors.
We summarize the parameters which have the greatest influence on the dust behaviors in Table 1.


Table  1:  Parameters  affecting  the  dust  behaviors 


Items          |  Parameters                                               |  Description
Vehicle        |  Velocity: $V_{car}$; Height: $H_{car}$; Width: $W_{car}$ |  Decide the size of the 3D volume behind a car where the turbulent wake is generated.
Dust particle  |  Size: $S_p$; Mass: $m_p$                                 |  Affect how an individual particle will react to the external forces.
Environment    |  Wind: $V_{air}$; Moisture: $MO_{air}$                    |  Influence the dust behaviors in general.
DYNAMICS OF A DUST PARTICLE


Computational fluid dynamics could be used to calculate the exact turbulent wake behind the vehicle.
However, this approach is computationally complex and prohibits achieving simulation in real-time. In order to
describe the dust behaviors caused by a fast traveling vehicle, here we simplify the dynamics of a dust particle,
analyze all the important forces, correlate these forces with the parameters affecting the dust behaviors, and
establish analytical models which can be used to simulate the dust behaviors in real-time. Here we first analyze the
forces behind a traveling vehicle, then we study the effects of these forces on an individual particle.


As the vehicle travels forward quickly, it produces a 3D volume behind the vehicle where the atmospheric
pressure is smaller than that of the other areas. To simplify the situation for our analysis, let us assume that the
vehicle does not turn, and the area affected by the vehicle is a box. That is, our particle system's range is a box
moving at the speed of the vehicle. The box's height and width are the same as those of the vehicle ($H_{car}$, $W_{car}$),
and its length (L) depends on the vehicle's velocity. Because of the fast movement of the vehicle, different places
within the box have different atmospheric pressures. The differences among the pressures will generate turbulent
wakes, and the dust behaviors accordingly (Fig. 2).


[Figure 2: figure not reproduced; surviving label: $F_{pret}$, pressure (from top)]


• The pressure or attraction force $F_{attr}$ towards the rear side of the vehicle

• The atmospheric pressure force $F_{pres}$ from the two sides of the box area

• The atmospheric pressure force $F_{pret}$ from the top and bottom of the box area

• The atmospheric damping force $F_{air}$ against the particle's relative movement

• The dust particle's gravity $F_{grv}$

$F_{attr}$ is a function of the vehicle's velocity ($V_{car}$), the size of the vehicle ($S_{car}$), and the distance between the
dust particle and the rear side of the vehicle ($l$). As the vehicle moves ahead, $l$ becomes larger, and therefore $F_{attr}$ is
reduced rapidly. Its direction is approximated by the direction in which the vehicle travels. We have the following
equation:

[Equation (4), the expression for $F_{attr}$, is illegible in the source.]

The scale parameter in Eq. (4) is a constant. $F_{pres}$ is a function of the vehicle's velocity ($V_{car}$), the distance between
the dust particle and the rear side of the vehicle ($l$), and the distance between the particle and the horizontal cross
section in the center of the box ($d_s$). As the vehicle moves ahead, $l$ becomes larger, and therefore $F_{pres}$ is reduced
rapidly. Its direction is towards the center cross section and parallel to the ground, which is approximated by the
direction of $d_s$. We have the following equation:

[Equation (5), the expression for $F_{pres}$, is illegible in the source.]

The scale parameter in Eq. (5) is a constant. $F_{pret}$ is similar to $F_{pres}$: it is a function of the vehicle's velocity ($V_{car}$),
the distance between the dust particle and the rear side of the vehicle ($l$), and the distance between the particle and
the horizontal cross section in the center of the box ($d_t$). Its direction is towards the center cross section and
perpendicular to the ground, which is approximated by the direction of $d_t$. We have the following equation:

[Equation (6), the expression for $F_{pret}$, is illegible in the source.]

The scale parameter in Eq. (6) is a constant. $F_{air}$ is a function of the particle's velocity ($V_p$), the particle's size ($S_p$),
the environment wind ($V_{air}$), and moisture ($MO_{air}$). Its direction is opposite to $V_p$, that is, against the particle's
movement. We have the following equation:

[Equation (7), the expression for $F_{air}$, is illegible in the source.]

In Eq. (7), $\delta_1$, $\delta_2$, $\eta_1$, and $\eta_2$ are constant parameters.


In  summary,  all  the  forces  acting  on  a  particle  are  shown  in  Fig.  3.  Here  we  ignore  the  collisions  among  the 
dust  particles. 


Figure  3:  Forces  acting  on  a  dust  particle 

Let $F_p$ be the force acting on a dust particle, $P$ the position, $V_p$ the velocity, $A_p$ the acceleration, and $m_p$ the
mass of the particle. Then a dust particle's behavior is described by the following equations:

$$F_p = F_{attr} + F_{pres} + F_{pret} + F_{grv} + F_{air} \tag{8}$$

$$A_p = \frac{F_p}{m_p} \tag{9}$$

$$V_p = V_0 + \int A_p \, dt \tag{10}$$

$$P = P_0 + \int V_p \, dt \tag{11}$$

To simplify the calculation, we use Euler's method to approximate the particle's next state:

$$V_i = V_{i-1} + A_p \cdot \Delta t \tag{12}$$

$$P_i = P_{i-1} + V_i \cdot \Delta t \tag{13}$$


The algorithm to compute the solution to the dust behaviors then is as follows. For the known current state of a
particle {$V_{i-1}$, $P_{i-1}$}, the next state {$V_i$, $P_i$}, after time $\Delta t$, is calculated by Equations (12) and (13). These
equations use functions (4) to (9). Equations (1) to (3) are used to generate a number of dust particles. We have
frames of the dust behaviors as in Fig. 4. Changing the parameters and conditions, we can achieve different
behaviors and appearances to suit the needs of the applications. The simulation runs at about [number illegible in the source] frames per second.
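
The update step of Equations (12) and (13) can be sketched as follows. The individual force models of Equations (4) to (7) are not reproduced here, so the caller supplies the force vectors; the function names and the tuple-based vector representation are illustrative choices, not the authors' code. Note that, as in Equation (13), the position is advanced with the already updated velocity.

```python
def total_force(forces):
    """Sum a list of 3D force vectors component by component (Eq. 8)."""
    return tuple(sum(components) for components in zip(*forces))


def euler_step(position, velocity, force, mass, dt):
    """Advance one particle by one time step.

    A_p = F_p / m_p                (Eq. 9)
    V_i = V_{i-1} + A_p * dt       (Eq. 12)
    P_i = P_{i-1} + V_i * dt       (Eq. 13)
    """
    acceleration = tuple(f / mass for f in force)
    new_velocity = tuple(v + a * dt for v, a in zip(velocity, acceleration))
    new_position = tuple(p + v * dt for p, v in zip(position, new_velocity))
    return new_position, new_velocity
```

For example, a particle of mass 2 at height 10 under gravity alone (force (0, 0, -19.6)) and a step of 0.1 s acquires a vertical velocity of -0.98 and drops to a height of about 9.902.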


Figure  4:  A  frame  of  dust  simulation 


SIMPLIFIED  DUST  PARTICLE  SYSTEMS 

The above physically-based dust particle model is time-consuming for a large number of particles. There are
too many factors in the equations, and the forces on each individual particle have to be calculated during the whole
simulation period. There are redundant calculations because when the particles are further away from the vehicle,
the forces $F_{attr}$, $F_{pres}$, and $F_{pret}$ are all reduced to near zero. Based on the above analysis (Equations 4-7), we can
divide a dust particle's behaviors into three stages to further simplify the simulation. We consider that we have
three different particle systems (models) working together to simulate the dust behaviors. The three stages are
called turbulent vortex, inertial momentum, and airborne drift, respectively.

TURBULENT VORTEX (THE FIRST STAGE)

Once a particle is generated, the initial forces $\mathbf{F}_{attr}$, $\mathbf{F}_{pres}$, and $\mathbf{F}_{pret}$ acting on it are very large (see Fig. 2, Fig. 3, and Equations 4-7). $\mathbf{F}_{attr}$ causes the particle to move in the forward direction, $\mathbf{F}_{pret}$ causes the particle to move up and down, and $\mathbf{F}_{pres}$ causes the particle to move left and right. All other forces are relatively small at this time. Suppose the particle is located in side 1 of the box area. Because the side pressure force $\mathbf{F}_{pres}$ points towards side 2, the particle will move from side 1 to side 2. Once the dust particle crosses the center section into side 2, $\mathbf{F}_{pres}$ changes its direction, and the particle will accelerate and move back from side 2 to side 1. The same holds for $\mathbf{F}_{pret}$, but in a perpendicular direction. At this stage, the forces acting on the dust particles are relatively strong. Overall, depending on the initial velocity, the particle will behave as in a turbulent vortex, as shown in Fig. 5.



Figure  5:  Dust  particle  traveling  trace  (the  first  step) 

Instead of calculating all the forces, we can simplify the model by just simulating the turbulent vortex with some random behaviors at this stage. Particles rotate around the center of the vortex. Here we assume the center of the vortex is the center of the cross section in the box area (Fig. 2). Its height is . The angle of rotation (RotAngle) for each frame of simulation can be calculated by the following simplified equation:
$RotAngle = \dfrac{RotD}{|\mathbf{R}|^{Tightness}}$    (14)

where $|\mathbf{R}|$ is the distance from the particle to the center of rotation (we use bold to represent vectors). Its initial value is a function of $N_p$ (Equation 3). We use the following equations to adjust this distance for each simulation frame:


$|\mathbf{R}|_i = |\mathbf{R}|_{i-1} \cdot (1 + \Delta r)$    (15)

$\Delta r = dR + k_1 \cdot (\ldots) \cdot myRandom(-\omega, 0, \omega)$    (16)

where $dR$ is a constant related to the initial velocity towards the center of the rotation, $k_1$ is a constant coefficient, $m_p$ is the mass of the particle, $S_p$ is the size of the particle, the function $myRandom(a, b, c)$ returns a random pick of the numbers enclosed, and $\omega$ is a comprehensive weight parameter reflecting the vehicle's properties:


$\omega = \dfrac{\lambda_1 \cdot (\ldots)_{car} + \lambda_2 \cdot (\ldots)_{car}}{2}$    (17)

where $\lambda_i$ is the weight coefficient of the corresponding parameter, for $i = 1, 2$; $0 < \omega < 1$. The value Tightness is usually between 1.0 and 2.0. A higher Tightness causes RotAngle to fade away more quickly as the distance becomes larger. RotD is the distance the particle travels in each time frame around the vortex circle, perpendicular to the center of the vortex. We use the following equations to calculate this value:


$RotD_i = RotD_{i-1} \cdot (1 + \Delta D)$    (18)

$\Delta D = dD + k_2 \cdot (\ldots) \cdot myRandom(-\omega, 0, \omega)$    (19)

where $dD$ is a constant related to the initial velocity perpendicular to the center of the rotation, and $k_2$ is a small constant. A dust particle may end up below the ground as its rotating radius increases. In our simulation, we simply remove those particles which hit the ground.


The  particle’s  translation  distance  (TransD)  along  the  vortex  axis  is  calculated  as  follows: 

$TransD_i = TransD_{i-1} \cdot (1 + \Delta TD)$    (20)

$\Delta TD = dTD + k_3 \cdot (\ldots) \cdot myRandom(0, \omega)$    (21)

where $dTD$ is a constant related to the initial velocity parallel to the center of the rotation, and $k_3$ is a small constant.
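The first-stage update can be sketched as follows. This is a hedged illustration, not the authors' implementation: the function names, the default constants, and the `scale` argument are hypothetical. In particular, `scale` stands in for the factor depending on particle mass and size, whose exact form is not legible in the equations, and the rotation angle is assumed to be RotD divided by the radius raised to the power Tightness.

```python
import random


def my_random(*values: float) -> float:
    """Random pick among the enclosed numbers, as the text describes myRandom."""
    return random.choice(values)


def vortex_step(radius, rot_d, trans_d, w, scale, tightness=1.5,
                d_r=0.01, d_d=-0.02, d_td=-0.02, k1=0.1, k2=0.1, k3=0.1):
    """One simulation frame of the turbulent-vortex stage.

    radius  : distance from the particle to the vortex center
    rot_d   : distance traveled around the vortex circle per frame
    trans_d : distance traveled along the vortex axis per frame
    w       : comprehensive vehicle weight parameter, 0 < w < 1
    scale   : ASSUMED stand-in for the mass/size factor (form unknown)
    """
    # Each quantity is multiplied by (1 + constant drift + random disturbance).
    radius = radius * (1.0 + d_r + k1 * scale * my_random(-w, 0.0, w))
    rot_d = rot_d * (1.0 + d_d + k2 * scale * my_random(-w, 0.0, w))
    trans_d = trans_d * (1.0 + d_td + k3 * scale * my_random(0.0, w))
    # ASSUMED form of the rotation angle: arc length over radius**tightness.
    rot_angle = rot_d / radius ** tightness
    return radius, rot_d, trans_d, rot_angle


if __name__ == "__main__":
    state = (1.0, 0.5, 0.3)
    for frame in range(3):
        *state, angle = vortex_step(*state, w=0.5, scale=1.0)
        print(frame, state, angle)
```

With Tightness equal to 1 the angle reduces to the ordinary arc-length-over-radius relation; larger values make the rotation die out faster as the particle moves away from the vortex center.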

INERTIAL MOMENTUM (THE SECOND STAGE)

As the vehicle travels and time passes, the forces $\mathbf{F}_{attr}$, $\mathbf{F}_{pres}$, and $\mathbf{F}_{pret}$ reduce rapidly and finally disappear. At this moment, the particle continues its movement with its current momentum. The forces $\mathbf{F}_{grv}$ and $\mathbf{F}_{air}$ are the primary forces governing its behavior (acceleration). We call this stage the Inertial Momentum stage. This stage continues until the dust particle's velocity is reduced to a small constant.

Every dust particle has a lifetime parameter, which is used to decide when its motion passes from the first stage into the second stage, and from the second to the third. We use the following equations to simulate the dust particle movement at this stage. The initial velocity is calculated from the first stage's values.


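$\mathbf{A}_i = A(m_p, S_p, \mathbf{V}_{i-1})$    (22)

$\mathbf{V}_i = \mathbf{V}_{i-1} + \mathbf{A}_i \cdot \Delta t$    (23)

$\mathbf{P}_i = \mathbf{P}_{i-1} + \mathbf{V}_i \cdot \Delta t$    (24)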

where $A$ is a function to calculate the acceleration. Here we use $A(m, S, V) = \ldots$ . When $\mathbf{A}_i$ becomes very small, we get into the drifting stage, as discussed in the next section.


AIRBORNE DRIFT (THE THIRD STAGE)

When the total force on a particle becomes very small, the dust particle begins to drift with constant velocity. The forces acting on the particle are balanced. Most of the dust particles will drift with the wind and eventually fall back to the ground. In this stage, the particles that stay in the air are those with very small mass. We simply keep the constant velocity with some random disturbances. If a dust particle touches the ground, it is dead. If a dust particle drifts from the range of the particle systems to the outside area, it is also considered dead. Dead particles are faded away after a few frames of simulation.


RENDERING  TECHNIQUES  AND  RESULTS 


MOTION  BLUR 

We use motion blur to achieve better animation. We record several consecutive positions of every dust particle. Each dust particle has a head pointer, which is the current position, and a tail pointer, which is the fading position. A particle is drawn a number of times into the buffer with larger and larger fading coefficients. The head is drawn at its current position with the particle's original color, and the tail is drawn at the earliest position with a much dimmer color. We let smaller particles have more blurring effect. That is, the smaller the dust particle is, the longer its blur process will be. This simple technique seems to make the simulation more realistic. We also use the comprehensive parameter $\omega$ (Equation 17) to control the blurring process. A larger $\omega$ value gives particles more blurring. The blur parameters can be modified to suit a particular application.
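A minimal sketch of the head/tail bookkeeping described above is given below. The class name `BlurTrail`, the linear size-to-length mapping, and the linear fade are assumptions made for illustration; the text only states that smaller particles blur longer and that older positions are drawn dimmer.

```python
from collections import deque


class BlurTrail:
    """Keeps the last few positions of one particle for motion blur."""

    def __init__(self, size: float, max_len: int = 8, min_len: int = 2):
        # ASSUMPTION: size is normalized to [0, 1] and the trail length
        # shrinks linearly from max_len (smallest) to min_len (largest).
        s = min(max(size, 0.0), 1.0)
        length = round(max_len - (max_len - min_len) * s)
        self.positions = deque(maxlen=length)  # oldest entries drop off the tail

    def push(self, position):
        """Record the particle's current position as the new head."""
        self.positions.append(position)

    def samples(self, color):
        """Return (position, faded color) pairs from head to tail.

        The head keeps the original color; each older position is dimmer.
        """
        n = len(self.positions)
        out = []
        for age, pos in enumerate(reversed(self.positions)):  # age 0 is the head
            weight = 1.0 - age / n
            out.append((pos, tuple(c * weight for c in color)))
        return out


if __name__ == "__main__":
    trail = BlurTrail(size=0.5)
    for step in range(6):
        trail.push((float(step), 0.0))
    for pos, col in trail.samples((1.0, 0.8, 0.6)):
        print(pos, col)
```

A bounded deque plays the role of the head and tail pointers: appending moves the head, and the oldest position is discarded automatically once the trail is full.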

BLENDING 


Dust particles can obscure other particles that are behind them, or they can be transparent and can cast shadows on other dust particles. We use Reeves's method to deal with this situation. Every particle is treated as a point light source when it is displayed. Each particle adds a bit of light to the pixels that it covers. A particle behind another particle is not obscured but rather adds more light to the pixels covered.
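The additive treatment described above can be illustrated with a small sketch. The function name, the single-channel frame, the one-pixel footprint per particle, and the clamp at 1.0 are all simplifying assumptions for illustration.

```python
def blend_additive(width, height, particles):
    """Accumulate particle light into a single-channel frame buffer.

    particles is an iterable of (x, y, intensity) with integer pixel
    coordinates. Drawing order does not matter: a particle behind
    another one is not hidden, it simply adds more light.
    """
    frame = [[0.0] * width for _ in range(height)]
    for x, y, intensity in particles:
        if 0 <= x < width and 0 <= y < height:  # skip particles off screen
            # ASSUMPTION: brightness saturates at 1.0.
            frame[y][x] = min(1.0, frame[y][x] + intensity)
    return frame


if __name__ == "__main__":
    dust = [(1, 0, 0.25), (1, 0, 0.5), (2, 1, 0.1)]
    for row in blend_additive(3, 2, dust):
        print(row)
```

Because addition is commutative, no depth sorting is needed, which is one reason this style of blending suits large particle counts.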


In order to speed up the rendering process, we restrict our rendering area to be the box area shown in Fig. 2. As
the  vehicle  travels  ahead,  the  box  area  moves  ahead  the  same  distance.  Any  dust  particle  outside  the  box  area  is 
treated  as  dead.  This  method  allows  us  to  have  a  background  texture  which  is  not  updated.  So  we  only  need  to 
calculate  and  render  the  dust  particles  within  this  box  area.  This  approach  reduces  the  number  of  dust  particles 
needed  and  also  reduces  the  memory  volume  needed. 


PARTICLES 

For each calculated particle, we generate a number of particles that have movement similar to this particle, with some random behaviors. This way we only calculate one particle, but a system of particles will behave accordingly. It saves the time of calculating all the particles and enhances the richness of the picture at the same time.
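One way to picture this is the sketch below, in which each calculated particle is accompanied by a few followers placed at small random offsets. The function name and the uniform jitter are hypothetical choices; the text does not specify how the random behavior is generated.

```python
import random


def spawn_followers(leader_pos, count, jitter, rng=None):
    """Return `count` positions scattered around a calculated particle.

    Each coordinate of a follower is the leader's coordinate plus a
    uniform random offset in [-jitter, +jitter] (ASSUMED distribution).
    """
    rng = rng or random.Random()
    return [
        tuple(c + rng.uniform(-jitter, jitter) for c in leader_pos)
        for _ in range(count)
    ]


if __name__ == "__main__":
    for follower in spawn_followers((1.0, 2.0, 3.0), count=4, jitter=0.1):
        print(follower)
```

Only the leader goes through the stage equations each frame; the followers are regenerated or moved relative to it, so the cost per visible particle stays low.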
RESULTS


Fig. 6 and Fig. 7 are the simulation results under the same circumstances with different vehicle velocities (30 miles/hour and 60 miles/hour). The number of particles and the density of the dust are greatly affected by the vehicle's velocity. Fig. 8 and Fig. 9 are the simulation results with different dust densities. There are many other parameters which affect the simulation. In most cases, the simulation looks better with more dust particles, but the simulation is much slower because all the particles must be calculated in the particle systems.


CONCLUSION  AND  FUTURE  WORK 

We have introduced our approach to simulating dust behaviors behind a traveling vehicle. We have two primary goals: one is realism of the simulation, the other is real-time computation. In order to achieve realism, we analyze the forces and factors and construct physically-based empirical models to generate particles and control the dust behaviors accordingly. In order to achieve real-time performance, we further simplify the numerical calculations by dividing the dust behaviors into three stages and establishing a simplified particle system model for each stage. We employ motion blur, particle blending, texture mapping, and other techniques in computer graphics to achieve better results. Our work is a useful addition to many applications in simulated virtual environments.


Our model is a physically-based empirical model. Using CFD to calculate the turbulent vortex behind a vehicle would yield an accurate physical model, which could be integrated into our work. The problem is that CFD models are too computationally complex. We have succeeded in simulating fluid flows in real-time [1]. We hope to find a solution for dust behaviors also. Currently our vehicle travels in a straight line. A logical extension will be to allow the vehicle to turn around. We are currently working on this.


We plan to further consider the interaction between the dust particles and the environment. For example, when the vehicle passes by, the grass on the two sides of the road will swing back and forth. We also plan to simulate the dust accumulation on the grass.
ABSTRACT

This paper discusses attempts to apply standard analytical methodologies to three
complex systems: (1) Field Experimentation mechanisms at the Army’s Combat
Development Experimentation Center at Fort Hunter Leggett [1962 - 1967]; (2) The
“Small Independent Action Force” activities in Vietnam [1969 - 1971]; and (3) The
Defense Acquisition Process [1983 - 1994]. The discussion below describes each of the
analyses  performed  and  the  results  obtained.  The  data  collected  indicates  that  the  key 
element  to  devising  means  of  achieving  better  system  performance  and  solving  problems 
associated  with  such  complex  systems  is  flexibility  to  act  as  required,  rather  than 
construction  of  rigid  rules  and  regulations.  Taken  as  a  body  of  information,  the  work 
suggests  that  in  order  to  understand  large,  complex,  man-machine  systems,  and  to  permit 
definition  of  generalized,  long  term  solutions  to  perceived  problems  we  must  go  beyond 
the  use  of  classical  analytical  methodologies.  It  may  be  that  finding  long  lasting  workable 
solutions  to  problems  may  require  new  analytical  mechanisms  developed  around  the 
concept of bounded change rather than singular solution.


BACKGROUND 

Albert  Einstein  is  cited  as  the  author  of  the  observation:  “The  thinking  that  got  us 
to  this  point  will  not  get  us  beyond  it!” 

For  those  of  us  who  grew  up  in  a  simpler  time,  the  greatest  utility  in  classical 
statistics  was  that  it  helped  us  predict  the  likely  outcomes  of  events  about  which  we  were 
unable  to  create  deterministic  models:  that  is,  there  was  uncertainty  about  the  problem,  the 
variabilities  which  affected  the  situation,  and,  the  relationships  among  them.  Because 
statistics  had  often  provided  methodology  to  project  outcomes  in  such  situations,  we 
extended  its  use  beyond  such  clear  cut,  well  bounded  events  as  “coin  toss”  and  “dice 
throws” to more complex, and less structured situations. Then, having invented multivariate
analysis and cluster analysis methodologies, we used them (along with our standard
statistical  techniques)  to  address  more  complicated  equipment  reliability  issues,  and  extend 
our  practice  further  into  the  realm  of  social  system  design. 

We  limited  the  scope  of  such  inquiries  to  some  extent  by  insisting  on  bounding  the 
issues  we  did  address.  As  Willis  Willoughby  (then  head  of  reliability  at  NASA)  said  when 
he was presented with an analysis of the probability of sending a man to the moon and
returning  him  safely,  (of  the  order  of  .027),  “That  problem’s  outside  the  range  of  utility 
for  standard  statistical  methodologies.  -  Let’s  get  on  with  the  job.” 

Today,  “statistical  methods”  are  used  as  analytical  tools  in  examining  extremely 
complex  systems.  The  usual  goal  is  to  attempt  to  predict  “outcomes”  in  situations  where: 
(1)  very  large  numbers  of  variables  are  involved;  (2)  we  are  not  certain  of  the  relationships 
among those variables; and, (3) human decision making provides the variables selected,
their  initial  and  subsequent  values. 

The  following  discussion  describes  some  of  the  work  that  leads  to  the  conclusions 
presented  on  page  6  et  seq. 


CASE  HISTORIES 


THE  CDEC  EXPERIENCE. 

In 1962, I joined the Stanford Research Institute’s CDEC Research Office as Chief
of  Instrumentation.  My  task  was  to  devise,  develop,  install,  and  operate  equipment 
capable  of  reporting  the  positions  of  everyone  on  the  battlefield,  and  the  events  they 
experienced  during  experimentation.  When  the  instrumentation  was  in  place,  we  could 
analyze  results  and  reach  conclusions  much  more  expeditiously.  As  we  accumulated  data 
from  a  number  of  field  experiments,  we  noticed  that  there  were  elements  of  maneuver 
which  were  repeated  in  almost  every  one  of  them.  Some  examples  were:  (a)  traversing  a 
well  known  terrain  area  at  different  times  of  the  day  (or  night);  (b)  simulating  an  attack  on 
a  high  ground  defended  in  one  of  several  standard  postures;  or  (c)  sweeping  an  area  to 
detect  targets  and  their  locations.  We  thought  about  creating  a  series  of  “set-piece” 
maneuvers;  activities  which  could  be  performed  repeatedly  on  selected  terrains  within  the 
Hunter-Leggett  Military  Reservation,  independent  of  a  specific  combat  context.  The  idea 
was  to  construct  experimental  segments  present  in  many  of  the  combat  field  activities, 
replicate  them  sufficiently  often  to  generate  statistics  about  expected  task  performances, 
establish a data base of “standard field performance values” for those combat tasks, and use
the  “statistical  values”  for  those  tasks  instead  of  actually  performing  them  in  different 
experimental  contexts. 

We  attempted  to  implement  the  idea.  We  planned  a  set  of  experiments  specifically 
for  the  purpose  of  developing  some  performance  statistics  about  how  troops  performed  a 
set  of  specific  tasks  under  controlled  circumstances.  We  selected  five  “standard”  combat 
tasks  and  structured  a  set  of  missions  around  them.  We  built  scenarios  which  required 
each  task  to  be  performed.  But  in  each  scenario  the  order  of  task  performance  was  varied. 
Since  we  did  not  have  an  unlimited  number  of  troops,  we  formed  minimum  building  block 
units  and  constructed  larger  units  by  aggregating  them.  Squads  of  five  soldiers  were  used 
as  the  smallest  building  block.  Platoons  and  Companies  were  built  from  squads.  The  larger 
units  were  asked  to  perform  missions  made  up  of  a  sequence  of  set  piece  combat  tasks. 
The scenarios were constructed to require task performance in different sequences.
Finally,  each  scenario  was  replicated  by  the  various  mixes  of  troops  in  different  organized 
units  a  sufficient  number  of  times  to  provide  a  bounded  value  set  for  each  combat  task. 
Figure  1  shows  the  notional  experimental  matrix  to  be  used  in  establishing  the  data  base. 
In Figure 1, the segment column indicates the tasks to be performed, and the sequence
columns  indicate  the  order  of  task  performance. 


Figure  1 

THE  CDEC  “SET  PIECE”  NOTIONAL  MATRIX 
But when experiments containing “set-piece” elements were performed within a
combat  context,  the  data  differed  considerably  from  what  had  been  anticipated  as  a 
“standard  data  set”.  It  became  very  clear  that  (a)  mission  context  had  great  effect  on  unit 
performance;  and  (b)  terrain,  mission  context,  and  tactical  assumptions  of  friendly  and 
enemy  force  distributions  formed  sets  of  complex  relationships  which  appeared  to  provide 
unique  situational  results. 

A  contributing  cause  of  result  non-reproducibility  was  thought  to  arise  from  the 
nature  of  human  beings.  They  learn  from  their  experience.  As  Kant  philosophized  [1],  and 
Damasio later observed experimentally and noted in his book “Descartes’ Error” [2], we
tend  to  see  things  within  the  context  of  our  own  experiences  and  the  way  we  see  them  has 
an  emotional  component  which  determines  how  we  record  the  experience  in  our  brain. 
When  we  observe  a  segment  of  mission  performance,  what  we  observe  depends  upon  all 
of  the  life  experience  precedent  to  that  segment  of  activity.  Thus,  a  major  component  of 
the  effect  we  observe  in  any  mission  performance  depends  on  our  individual  prior  history  - 
which  includes  the  order  of  previous  task  performance,  and  our  individual  sense  of  mission 
continuity  generated  by  those  prior  experiences  as  well  as  the  emotional  component  of  the 
incidents  recorded. 

THE SMALL INDEPENDENT ACTION FORCES (SIAF) EXPERIENCE

In  1968,  the  Defense  Advanced  Research  Projects  Agency  (DARPA)  authorized 
construction  of  a  data  base  about  activities  of  Small  Independent  Action  Forces  (SIAF) 
operations in Vietnam. Analysis of SIAF operations had shown that small patrols of from
5  to  9  troops  operating  for  time  periods  of  between  2  and  9  days  exterior  to  larger 
organizations  (companies,  or  battalions)  were  more  successful  in  performing  their  missions 
than  were  similar  sized  units  operating  within,  and  under  the  control  of  larger  force 
elements.  Compared  with  traditional  larger  force  elements,  these  units  had  fewer 
casualties,  expended  less  ammunition,  and  reported  higher  numbers  of  enemy  detections 
while remaining undetected themselves. There were four types of small units involved: (1)
SEALS (Navy units); (2) Long Range Reconnaissance Patrol (Marine units); (3) Ranger
units (Army); and (4) Special Forces units. The idea was to determine how small units
made  operational  decisions  during  patrols  with  emphasis  on  how  perceptions  of  enemy 
forces  influenced  those  decisions.  By  doing  so  in  both  combat  and  non-combat  situations, 
we  hoped  to  provide  a  data  base  permitting  construction  and  test  of  a  combat  model 
which  would  validate  hypotheses  based  on  lessons  learned  in  the  field.  Over  the  period  of 
16  months,  our  team  reviewed  1,500  SIAF  “after  action  reports”  and  collected  data  from 
an  equal  number  of  patrols.  All  of  the  services  were  about  equally  represented. 

Our  first  finding  was  that  SIAF  units  had  developed  the  mission  time  line  shown  in 
Figure  2. 


Figure  2 

SIAF  UNIT  MISSION  TIME  LINE 
In Figure 2, boxes surround each element of the SIAF mission. The data
elements  collected  about  each  mission  element  are  listed  below  each  mission  element  box. 

When  we  collected  data  from  returning  patrols,  we  met  the  patrols  as  soon  as  they 
had  finished  their  “mission  completion  chores”.  Using  the  model  in  Figure  2,  we  took 
minute  by  minute  data  about  each  patrol  from  the  time  they  were  alerted  to  expect  an 
operational  order,  to  their  withdrawal  from  combat  terrain.  We  were  interested  in  how 
patrol  routes  were  selected,  speed  of  terrain  traverse,  reconnaissance  methodology,  target 
detections,  tactical  choice  rationales  and  almost  all  of  the  minutiae  embedded  within  the 
combat  mission  sequence.  We  also  collected  a  full  set  of  demographic  characteristics 
about the individual squad members and their unique histories. At the end of our 12-month
in-country data collection activity, we felt we could construct a data base which
adequately represented the sum of experiences of small forces in Vietnam combat.

To  test  the  reproducibility  of  the  SIAF  experience  in  areas  other  than  Vietnam,  we 
established  a  field  test  site  in  the  Hawaii  National  Forest  on  terrain  similar  to  that  reported 
by  SIAF  units  in  Vietnam.  On  the  experimental  terrain,  we  measured  parameters  of  SIAF 
type  performance  achieved  by  Vietnam-veteran  Army  and  Marine  squads  operating  in  the 
SIAF  mode.  23  patrols  operated  within  a  set  of  simulated  combat  situations.  We 
measured  the  patrol  performance  with  special  instrumentation  borrowed  from  CDEC.  We 
collected  the  same  data  about  participants  as  we  had  collected  in  Vietnam.  When  we 
compared  the  set  of  test  data  from  Hawaii  with  the  Vietnam  Interview  data,  we  noted  that 
velocity  of  movement  over  terrain  for  various  kinds  of  terrain,  and  the  ratio  of  target 
detection  to  numbers  of  target  available  for  detection  were  heavily  dependent  upon  the 
specific  mission  experience  as  it  unfolded,  and  secondarily  on  terrain  parameters. 

We  then  built  a  computer  assisted  game  to  use  at  the  Army  Special  Forces  School. 
Using  Vietnam  combat  veterans  as  test  subjects,  we  would  simulate  a  SIAF  patrol  in  a 
closed  environment.  Simulations  began  upon  delivery  of  a  mission  operational  order.  Data 
was  taken  about  how  mission  details  were  planned  and  executed.  Pictures  of  the  actual 
mission  terrain  were  used  to  help  patrol  leaders  make  key  decisions  about  movement  and 
force  tactics  on  that  terrain.  At  each  step  of  the  way,  the  computer  compared  decisions 
and  actions  of  the  test  troops  with  the  data  derived  from  Vietnam  and  from  the  test  facility 
in  Hawaii. 

When  we  compared  the  three  sets  of  data,  we  reached  an  interesting  set  of 
conclusions.  In  general,  performance  varied  with  insertion  method,  order  of  mission  task 
performance,  perception  of  distributed  enemy  forces,  and  perceived  operating 
environment  (terrain,  weather,  stimuli,  etc.).  Details  of  unit  performance  were  not 
duplicated  even  in  similar  environments,  and  when  proceeding  under  similar  operational 
orders. The specific findings were:

(1)  for  each  of  the  three  data  bases,  the  decision  rules  which  led  to  the  observed 
results  were  different; 

(2)  the  consolidated  data  base  inclusive  of  all  three  of  the  independently  derived 
data  sets  could  be  bounded;  but, 

(3)  while  a  bound  could  be  defined  for  the  measured  parameters  (rate  of 
movement  over  terrain,  target  detection  ratio,  range  of  detection,  etc.),  that 
bound  was  strongly  dependent  on  the  patrol’s  perception  of  (a)  enemy  troop 
distribution;  (b)  numbers  of  enemy  detections  made  from  insertion  to  the 
time  at  which  measurements  were  made;  and  (c)  the  terrain  characteristics; 

(4)  the  boundaries  which  resulted  from  each  data  set  result  (i.e.,  the  probability 
density  functions)  were  different  because  the  number  of  variabilities 
embedded  in  the  observations  were  so  large  and  depended  so  heavily  on 
individual troop perceptions; and

(5) because we had only a moving boundary and could not suitably
integrate  the  data  into  a  singular  data  base,  we  were  unable  to  use  the  results 
to  predict  narrow  bounds  for  future  performance  by  similarly  configured 
troops  comprised  of  equal  numbers  of  individuals  with  similar  demographic, 
training,  and  combat  histories. 

We  also  made  some  conjectures: 

(1) task performance may depend on decision chains (short and long term
learning  effects)  plus  perceptions  of  what  is,  and  has  been,  going  on  during 
the  mission,  test/experiment,  or  game,  and 

(2)  performance  of  the  same  mission  sequences  may  not  be  reproducible  in  the 
same/similar  environments. 


THE  DEFENSE  ACQUISITION  PROCESS 

In  1983,  when  I  became  Professor  of  Engineering  Management  at  the  Defense 
Systems  Management  College,  I  began  to  study  the  process  by  which  the  Services 
purchased  major  weapon  systems.  The  process  at  that  time  was  described  by  the  old 
Department  of  Defense  Directive  5000.1.  It  proceeded  from  the  establishment  of  an 
operational  requirement  through  the  delivery  and  operation  of  end  weapon  items  to  the 
fighting  forces.  Figure  3  shows  the  major  steps  in  the  process. 


Figure  3 

STEPS  IN  THE  DEFENSE  ACQUISITION  PROCESS 


Our initial plan had been to: (1) collect detailed anecdotal data on the effect of
perturbing events on acquisition process performance; (2) form a cause/effect data base of
relationships and their effects on cost and time-line projections; and (3) compare data
base results with programs external to the data base to establish data utility. To implement
the plan, we began to gather statistics on elapsed time between: (1) definition of an
operational  requirement  and  formation  of  a  group  to  explore  competing  system  concepts; 
(2)  establishment  of  the  concept  exploration  group  and  selection  of  the  concept  to  be 
developed;  (3)  selection  of  the  concept  to  be  developed  and  completion  of  the  engineering 
prototype;  and  (4)  narrative  descriptions  of  events  which  affected  those  elapsed  time 
intervals. 

We  observed  a  broad  range  of  elapsed  time  intervals  for  each  of  the  acquisition 
process  steps.  When  we  explored  the  “reasons”  why  there  was  such  performance  diversity, 
we  noted  that  many  could  be  traced  to  decisions  during  process  execution  taken  by  those 
outside  the  direct  acquisition  activity  chain. 

One  major  contributing  factor  involved  how  resource  streams  were  allocated  to  the 
program.  Specifically,  gaining  approval  to  proceed  with  concept  exploration  requires 
provision  of  pro-forma  “program  budget”  and  “time  to  complete”  estimates  for  the  entire 
development  program  (which  might  be  expected  to  take  up  to  15  years).  These  estimates 
almost  always  assumed  that:  (1)  resources  would  be  available  as  required;  (2)  the 
concept/engineering/production/sustainment  activities  as  stated  were  those  which  would 
ultimately  be  proven  correct;  and,  (3)  there  were  no  unforeseen  impediments  either  to  the 
order  in  which  all  of  the  activities  described  were  to  be  performed,  or  the  time  line 
projected  for  their  performance. 

However, it was almost always the case that: (1) actual resource levels failed to
meet planning (both in amount available and the dates of availability); and (2) a lack of
timely resources, or some previously unknown technical problem, prevented the
forecasts from being implemented as planned in the initial assumptions.

It also became clear that the points in time at which program resources became
constrained, or technical assumptions proved to be in error, were major factors in
determining what happened along the time line from initial program authorization to
ultimate  operational  deliveries.  The  effects  observed  on  the  outcomes  of  each  succeeding 
process  step  were  different  depending  mainly  on  those  two  factors.  While  it  was  possible 
to  compute  a  boundary  within  which  all  observed  times  for  step  accomplishment  fell,  we 
did  not  know  how  to  use  that  information  to  predict  what  would  happen  if  resources  were 
constrained  by  some  arbitrary  amount  at  some  random  point  along  the  time-line  of 
program  execution.  There  were  a  great  number  of  additional  variables  which  depended 
upon  the:  (1)  rate  of  change  of  the  end  requirement;  (2)  kind  of  program  being  pursued; 
and,  (3)  rate  of  technology  advance  during  the  development  period.  In  short,  there  were 
so  many  variables  involved  that  we  simply  could  not  collect  data  in  sufficient  amounts  to 
permit  us  to  do  good  statistical  analysis.  We  decided  to  try  another  methodology. 

With  approval,  funding,  and  active  assistance  from  the  Undersecretary  of  Defense, 
Acquisition,  we  held  a  series  of  three  day  workshops  to  explore  in  detail  the  problems 
which  affected  the  acquisition  process  for  smart  munitions  and  other  kinds  of  weapon 
systems. Each workshop had 10 participants. All participants were Program Managers or
Deputy  Program  Managers  from  government  (military  and  civilian)  and  contractors.  Their 
senior  staff  members  were  available  to  provide  additional  details  of  pertinent  information. 
During each workshop, participants described the perturbing events they
experienced, the responses taken to those events, and the effects of the events on
program performance.

As  we  continued  to  explore  the  process,  we  found  great  diversity  in  the  types  of 
problems  experienced  during  program  performance;  and  we  derived  relationships  among 
them.  At  the  conclusion  of  our  work,  we  created  a  data  base  from  all  of  the  workshop 
data.  The  data  base  enabled  us  to  produce  a  relational  model  that  showed  us  how 
problems  experienced  at  some  point  in  the  process  affected  the  outcomes  at  succeeding 
points  along  the  time  line  of  process  performance.  These  relational  models  provided  us 
with  a  good  understanding  of  how  an  event  might  influence  process  performance  in  the 
context of the particular situation. But we also discovered that the same event (e.g., a
budget cut of 10%), occurring at a different time in the process (e.g., concept exploration
vs. engineering development) or in different programs (e.g., a ground-to-air missile or an
air-to-ground missile), even among those using the same technology, could produce very
different results in terms of completion dates and costs for succeeding process steps.


SOME  TENTATIVE  CONCLUSIONS 

We  established  some  elements  of  similarity  in  each  of  the  cases  discussed  above: 

(1) All of the processes (i.e., experimentation at CDEC, small force combat in
Vietnam, and exercise of the Defense Acquisition Process) were complex, and were
influenced by many variables which, in turn, were of complex structure.

(2)  Process  steps  generally  were  describable  as  instances  of  repetitive  task 
performance;  the  processes  were  made  up  of  a  set  of  fixed  functional  sequences;  but  the 
functional  sequences  were  performed  within  the  context  of  different  complex  situations: 
situations  which  themselves  were  composed  of  many  inter-related  factors  capable  of 
influencing  the  process  outcome. 

(3)  The  temporal  relationship  between  events  which  affected  the  process  outcome 
and  the  point  along  the  process  time-line  at  which  those  events  occurred  was  variable  and 
often  depended  upon  factors  not  necessarily  embedded  within  the  process. 

We  found  that  in  the  situations  discussed  above,  taking  exactly  the  same  action 
(e.g.,  absorbing  a  budget  reduction  of  10%)  in  different  contexts  (e.g.,  a  “smart 
munitions”  program  or  a  new  aircraft  development  program),  and  at  different  points  along 
the  process  time-lines  (e.g.,  in  concept  exploration  or  engineering  development  and  test) 
produced situationally dependent singular effects.

The findings across all three cases led us to consider the question of whether
applying  classical  mathematical  methods  to  these  kinds  of  issues  requires  making  some 
fundamental assumptions unlikely to be present in the real world. For example, is it
realistic to assume that: (1) there is “event reproducibility” across time and space; (2) the
population of outcomes is bounded or stable; or (3) there is “central tendency” in
repeated sampling? Where there is absence of repeatability; when the dimensions of
outcomes  are  unbounded;  or  when  the  numbers  of  samples  required  to  generate  acceptable 
levels  of  confidence  are  very  high,  we  might  re-consider  the  use  of  classical  statistical 
approaches.  It  might  be  necessary  to  develop  new  mathematical  concepts  better  suited  for 
addressing  the  kinds  of  issues,  involving  large,  complex  systems  which  incorporate  a 
multitude  of  human-machine  interfaces. 

When such conditions are present it might be more appropriate to use perturbation
and/or chaos concepts than to try to satisfy requirements for application of standard
statistical methodologies so useful in situations which can be replicated reasonably well. In
short, does the richness of situational and process complexity make each result observed
non-repeatable, even though an effect observed can be assigned to what appears to be the
same cause (a 10% funding reduction)? We believe that this kind of problem construct can
make  it  necessary  to  find  limits  of  cause  and  effect  rather  than  equations  which  link 
discrete  cause  to  particular  effect. 

Perhaps  a  good  illustration  can  be  found  in  Gleick’s  discussion  of  Foucault’s 
Pendulum  [3]  which  points  out  that  while  the  exact  path  of  the  pendulum  is  not 
predictable,  the  boundaries  within  which  that  path  will  lie  (i.e.,  trajectory  bounds)  are;  and 
the  focus  around  which  all  possible  paths  will  reverse  for  a  bounded  set  of  initial  and 
subsequent conditions (the “attractor”) can be predicted. While Gleick’s illustration is
relatively  simple,  there  is  no  difficulty  in  finding  additional  complex  situations  in  which  one 
would  like  to  predict  an  outcome.  However,  such  situations  appear  to  be  characterized  by 
non-repeatable  “cause  and  effect”  linkages  within  contexts  of  large  numbers  of  variables 
which can affect the outcomes.

There  may  be  yet  another  difficulty  in  applying  some  standard  statistical 
methodologies  to  complex  processes;  interactions  among  complex  processes  and  complex 
environments  may  preclude  the  assumption  of  a  “central  tendency”  in  observations  made 
over  time.  There  may  even  be  difficulties  in  application  of  chaos  concepts  in  such 
circumstances.  It  may  be  that,  over  time,  the  location  of  the  attractor  in  the  complex  space 
of  its  existence,  changes  in  response  to  changed  values  of  parameters  which  determine  its 
location.  Such  occurrence  would  preclude  specifying  fixed  attractor  loci.  Thus,  a  “space 
within  which”  points  will  be  found  (with  no  preference  about  where  in  that  space 
particular  effects  of  particular  causes  will  lie)  might  be  all  that  can  be  postulated  for  each 
iteration  of  a  particular  complex  process  interacting  with  its  environment.  It  may  also  be 
that  as  complex  processes  play  out,  the  boundaries  of  attractor  space  change  to 
accommodate changes in context (relationships among system components) of the system
under  study  and  the  environment  with  which  it  interacts  and  is  affected. 

One possible illustration of this kind of situation may be the efforts of Motorola to
achieve,  and  then  maintain  a  reduction  in  its  manufactured  component  rejection  rate  by 
tightly  controlling  the  manufacturing  processes  such  that  “a  six-sigma  boundary”  is 
experienced  for  all  of  its  various  manufacturing  operations.  In  describing  what  was  done 
and  how  it  was  accomplished,  Motorola  representatives  said  that  while  the  effort  had  been 
very  successful  in  providing  tightly  controlled  manufacturing  processes,  they  had  also 
observed  that  the  point  about  which  the  tight  control  was  held  (the  mean  value)  would 
shift  over  time.  They  said  that  the  shifting  mean  could  not  be  assigned  simply  to  changes  in 
process  tooling  or  to  change  in  materials  input  to  the  process.  Those  parameters  were 
capable  of  tight  control.  To  the  question,  “What  is  causing  the  shift  in  mean  value  when 
the  boundaries  of  repeatability  are  maintained  so  well?”,  the  response  was:  “We  don’t 
know and haven’t been able to find out!” One can hypothesize that in this case: (1) the
attractor  is  the  mean  value  of  items  produced  by  the  process;  (2)  the  boundary  of  possible 
values  is  six  sigma;  and,  (3)  the  shifting  mean  is  the  system’s  response  to  change  to  the 
complex  environment  within  which  the  process  operates. 

There  is  a  very  real  problem  which  arises  if  one  cannot  depend  upon  the  analytical 
results  drawn  from  observations  of  complex  systems  over  time.  It  is  linked  to  the  way  we 
have tended to “solve” problems in the past. Increasingly, we have attempted to impose
sweeping sets of constraints on allowable behavior when certain situations arise (e.g., a
law or regulation). Such prohibitions create an inflexible set of boundaries which limit
response  even  when  the  spectrum  of  allowable  responses  is  inappropriate  to  the 
situation. 

The  issue  is  not  simply  one  of  finding  a  set  of  appropriate  responses  in  a  static  or 
fixed  bounded  situation;  it  is  rather  a  problem  of  attempting  to  take  appropriate  action 
under  conditions  of  change  when  the  time  constants  of  change  for  the  system  and  its 
environment  may  be  considerably  different  from  the  time  constants  of  change  for  the 
system  of  constraints.  The  Defense  Acquisition  process  is  a  good  case  in  point.  Title  10  of 
the  United  States  Code  provides  the  governance  for  how  the  acquisition  process  is  carried 
out by members of the Defense Acquisition Corps. Congress is responsible for making
changes to the U.S. Code through the legislative process. In the absence of some very
compelling  reason,  up  to  a  year  can  elapse  before  a  change  can  be  made  even  when  there 
is  a  feeling  of  urgency  about  the  need  to  make  change.  The  legislative  mechanisms  are 
designed  for  deliberative  process,  rather  than  for  emergency  actions.  Emergency  action 
rules  are  provided  for  in  the  operating  regulations  which  result  from  legislation:  but 
regulations  are  also  slow  to  change  even  under  extreme  conditions  of  perceived  need.  It 
has  taken  more  than  3  years  to  make  any  major  changes  in  the  acquisition  process! 

In  many  instances,  situations  have  arisen  over  the  years  which  indicated  the  need  to 
take actions forbidden under then existing rules and regulations. At those times,
Acquisition Corps members had some choices: (1) doing something not really appropriate
to  the  situation,  but  within  the  boundaries  of  permitted  actions;  or  (2)  doing  something 
“outside the rules” and taking action they believe appropriate in that specific situation. The
dilemma  between  doing  what  is  permitted  and  what  is  deemed  “appropriate  under  extant 
regulation”  can  create  considerable  strain  on  the  acquisition  work  force.  Often,  stepping 
outside  the  boundaries  of  permitted  action  may  resolve  the  immediate  issue,  but  it  lays  the 
individual  open  for  post  facto  analyses  in  which  breaking  rules  becomes  more  important 
than  a  successful  outcome.  If  the  outcome  has  been  less  than  successful,  the  result  of  such 
inquiry  is  likely  to  be  very  bad  for  the  individuals  involved. 

The presumption that a body of statistical data can be developed which can
provide the basis for structuring sets of “best appropriate actions” under all circumstances
has not been readily demonstrated in these three instances.

A  summary  of  the  conjectures  resulting  from  this  work  can  be  stated  as  follows: 

•  The  processes  studied  in  this  work  are  sufficiently  complex  to  make  data 
reproducibility  doubtful. 

•  Outcomes  appear  to  be  singular  and  to  depend  heavily  on  decision  chains  influenced, 
but  not  predetermined,  by  perturbing  events. 

•  Like  magnitude  perturbations  in  similar  circumstances  produce  different  outcomes 
depending  upon  situational  context. 

•  In  such  processes,  with  their  large  numbers  of  independent  variables,  and  the 
uncertainties  of  variable  dependencies,  cause-effect  linkages  appear  to  be  confounded. 

•  Gaining  sufficient  data  to  develop  confidence  in  predictions  made  from  the  data  base 
may  require  a  long  time  period. 

•  If  long  time  periods  are  required  for  data  collection  and  analysis,  the  environment 
within  which  the  process  operates  may  undergo  considerable  change. 

•  Half-life  of  collected  data  and  the  results  obtained  from  it  may  be  too  short  to  test  data 
base  utility  or  prediction  validity. 


A  SUGGESTED  FUTURE  COURSE 

Suppose  one  were  required  to  develop  an  approach  to  understanding  complex 
systems  in  the  world  of  non-repeatable  cause-effect  observations  where  a  central  tendency 
has  been  replaced  by  a  temporary,  changing  central  value  which  depends  upon  a 
continuing  stream  of  changing  relationships  in  the  system/environment/context  paradigm. 
How  could  one  proceed?  In  a  world  which  is  rapidly  becoming  more  and  more 
interconnected,  it  seems  reasonable  to  predict  that  future  problems  will  be  characterized 
by  increasing  numbers  of  interacting,  rapidly  changing  variables  which  impact  on  the  way 
we  live,  conduct  our  business,  and  make  decisions  for  our  future.  Under  those  conditions, 
there  will  be  greater  need  for  tools  which  permit  treatment  of  complexity  in  a  timely,  yet 
thorough  manner. 

The  statistics  of  the  future  will  require  consideration  of  a  set  of  new  concepts 
founded on characteristics of non-repeatable events which take place within changing
environments.  It  may  also  be  necessary  to  provide  entirely  new  mechanisms  to  integrate 
and  display  the  data  which  are  necessarily  involved  in  those  situations.  Even  today,  human 
beings  are  reaching  the  limits  of  cognitive  capability  when  dealing  with  complex  issues 
described  by  large  amounts  of  information  presented  in  traditional  ways  (e.g.,  matrices,  or 
linear  models).  As  the  need  grows  to  understand  and  integrate  larger  amounts  of 
information,  the  limits  of  cognitive  capacity  will  be  stretched  perilously. 

The  task  for  those  of  us  who  strive  to  develop  and  apply  analytical  tools  to  difficult 
issues,  is  to  consider  expanding  well  beyond  the  envelope  of  traditional  methodologies  and 
finding  ways  to  accommodate  the  needs  of  our  future.  In  such  a  quest,  we  can  again  look 
to  Einstein  and  ask  whether  he  has  anticipated  the  problem.  He  is  known  to  have  said 
often that:


“Imagination  is  more  important  than  knowledge!” 


Abstract

Olwell (1997b) developed methods for the control of processes which
have low frequency events, usually failures. Here we illustrate the methods
by considering three types of isolated incidents: Driving Under the
Influence  (DUI)  offenses  from  an  armored  division  in  Germany,  Class  A 
aviation  accidents  in  the  Department  of  Defense,  and  rapes  in  New  York 
City.  We  then  propose  that  these  methods  be  adopted  for  key  command 
interest  items  in  the  military.  The  implications  of  this  proposal  are  briefly 
discussed. 

Keywords: Predictive distributions, short run, Poisson, military, accidents,
rape.


1  Introduction  and  Background 

Statistical Process Control (SPC) is a body of graphical techniques which separates
usual variation from unusual variation in a process, based on a model. We
will  consider  two  types  of  charts  in  this  paper,  which  identify  isolated  departures 
from  the  model  and  persistent  departures  from  the  model.  These  methods  are 
self-calibrating:  they  learn  from  and  adjust  to  the  historical  record,  and  can  be 
implemented  without  the  large  training  sets  of  data  that  characterize  classical 
SPC. 

We  propose  to  apply  these  methods  to  the  management  of  sexual  harassment 
data.  No  such  data  is  available  in  useful  form  from  military  sources.  To  illustrate 
the  methods,  we  turn  to  three  data  sets  believed  to  be  similar  in  nature  to  data 

*This  research  was  supported  by  the  Army  Research  Laboratory,  the  USMA  Mathematical 
Sciences  Center  of  Excellence  (MSCE)  and  the  Dean’s  Fund  for  Faculty  Research,  USMA. 
Military Academy Approved for public release; distribution unlimited. “The views expressed
herein are those of the author and do not purport to reflect the position of the United States
Military Academy, the Department of the Army, or the Department of Defense.”




on sexual misconduct. They are rapes in a precinct in New York City; driving
while under the influence (DUI) offenses in an Army division in Europe; and
Class A aviation mishaps. These examples illustrate the methods and
their  generality,  and  provide  scenarios  for  further  illustrating  how  the  methods 
are  useful  for  turning  data  into  meaningful  information  for  the  commander. 

The  technical  underpinning  of  these  methods  was  presented  in  an  earlier 
article (Olwell, 1997b). Here we focus on the applications and analysis.

The  military  has  a  need  for  methods  like  this  which  can  be  applied  to  a  range 
of  problems,  including  but  not  limited  to  regular  misconduct,  sexual  misconduct, 
accident  rates,  equipment  failures,  and  suicides.  In  each  case,  leaders  need  to 
know if the variation in the processes can be classified as usual or unusual. Traditional
SPC methods frequently do not apply, because the extensive historical
data needed to calibrate them is not available; or because policy changes make
previous historical data less relevant to current processes. When investigation
of sexual misconduct in the Army began, for example, there was no useful historical
data available to calibrate traditional charts. Self-starting charts and
methods such as these were needed.

Similar  situations  occur  in  industry  and  other  levels  of  government. 

Leaders  find  these  charts  easy  to  understand  and  interpret,  as  our  experience 
with  both  uniformed  and  civilian  military  leaders  and  with  the  leaders  of  the 
New  York  City  Police  Department  has  shown. 

As  a  consequence  of  the  Aberdeen  scandals  of  Fall,  1996,  the  Army  has 
been revisiting its methods for reporting and monitoring misconduct, especially
sexual  misconduct.  There  is  no  central  reporting  mechanism  as  of  this  writing. 
Should  data  be  collected,  the  methods  of  this  paper  were  specifically  designed 
for  monitoring  those  reports  and  detecting  both  unusual  reporting  periods  and 
persistent  changes  in  the  reporting  process. 

No  incident  of  misconduct,  sexual  or  otherwise,  is  acceptable.  It  represents 
a  failure,  and  must  be  addressed.  However,  the  Army  has  procedures  in  place 
to  deal  with  its  usual  levels  of  misconduct.  Senior  leaders  who  monitor  these 
procedures  need  tools  to  distinguish  usual  levels  from  unusual  levels,  and  to 
detect isolated or persistent changes from historical levels. Leaders hold extraordinary
measures “in reserve,” and must commit them wisely. It cheapens
the  effect  of  extraordinary  measures  to  apply  them  when  they  are  unwarranted. 
Worse,  the  measures  themselves  can  cause  shocks  to  the  underlying  system  and 
destabilize  it.  On  the  other  hand,  failing  to  react  to  extraordinary  events  may 
allow  a  bad  situation  to  become  much  worse.  This  dilemma  is  analogous  to  the 
decision  to  use  an  antibiotic:  its  frequent  indiscriminate  use  markedly  decreases 
its  effectiveness,  yet  its  omission  may  result  in  the  death  of  a  truly  sick  patient. 

The methods of this paper provide tools for determining when to implement
extraordinary measures. We illustrate this point with our data sets,
especially  with  the  Class  A  aviation  mishap  data  from  1997. 

In other words, we are arguing to apply the statistical thinking underlying
the Deming philosophy to the management of data used to describe key
command  interest  items  in  the  military.  Our  contribution  is  to  have  developed 
methods  which  work  for  the  application  of  Deming’s  ideas  to  short-run  Poisson 


2 Example — Class A Aviation accidents

We  turn  to  four  examples  to  illustrate  the  methods.  We  note  that  there  is 
no  useful  data  on  incidents  of  sexual  harassment  in  the  Army  extant,  at  least 
for  the  purposes  of  these  methods.  We  turn  to  similar  examples.  We  examine 
aviation  accidents,  rapes  in  New  York  City,  and  lastly  misconduct  (albeit  DUI) 
in  a  division  in  Germany. 


2.1  Example  —  Army  Class  A  aviation  mishaps 

A  Class  A  aviation  mishap  in  the  military  is  one  that  involves  a  fatality  or 
damage  in  excess  of  one  million  dollars.  For  Fiscal  Year  (FY)  1997,  the  number 
of  Class  A  aviation  accidents  per  week  is  given  in  Table  1.  The  weeks  run  from 
Sunday  to  Saturday,  so  53  weeks  are  reported.  We  are  interested  in  detecting 
unusual  isolated  weeks  and  persistent  shifts  from  the  usual  accident  rate. 

We note that this data would be better analyzed by weighting for the number
of  flying  hours  each  week,  but  that  data  was  not  available  to  the  author. 

We  will  first  apply  the  methods  of  this  paper  to  the  Army  aviation  data  set, 
first  with  a  weak  prior  and  then  with  a  strong  prior.  For  our  weak  prior,  we 
choose α = 1 and β = 4. This corresponds to an expected rate of 0.25 incidents
per  week,  with  a  great  deal  of  uncertainty  as  to  the  true  value. 

The  charts  for  an  isolated  departure,  persistent  departure,  and  the  posterior 
distribution  at  the  end  of  the  year  are  shown  at  Figures  1,  2,  and  3.  They 
behave  much  more  reasonably.  Interpretation  of  these  charts  shows  that  the 
first  four  weeks  with  accidents  are  signaled  as  unusual.  After  the  first  twenty 
weeks,  enough  evidence  has  accumulated  about  the  underlying  accident  rate  to 
see  that  one  accident  in  a  week  is  not  an  isolated  model  departure. 

We have set the persistent chart with a two-sided average run length of
250. Against that criterion, there is no evidence of an upward shift, although a
less stringent threshold would signal an increase in the mean number of
accidents.

We compare these charts with those obtained using a strong prior, say α =
10 and β = 40, shown at Figures 4, 5, and 6. This prior corresponds to almost
a  year’s  worth  of  data.  We  see  that  the  isolated  chart  (which  now  looks  exactly 
like  the  original  data  plot)  no  longer  displays  a  learning  period,  and  that  there 
is  no  evidence  of  a  persistent  upward  shift. 
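
The two priors can be compared directly through their means and spreads. The following is a minimal sketch, assuming (as the stated expected rate of 0.25 suggests) that the prior on the weekly rate is a gamma distribution with shape α and rate β. The function name is illustrative and does not come from the paper.

```python
import math

def gamma_prior_summary(alpha, beta):
    """Mean and standard deviation of a Gamma(shape=alpha, rate=beta) prior.

    Assumption: the paper's (alpha, beta) are shape and rate, so the
    prior mean is alpha / beta and the variance is alpha / beta**2.
    """
    mean = alpha / beta
    sd = math.sqrt(alpha) / beta
    return mean, sd

weak = gamma_prior_summary(1, 4)      # weak prior from the text
strong = gamma_prior_summary(10, 40)  # strong prior from the text

print("weak prior:   mean = %.3f, sd = %.3f" % weak)
print("strong prior: mean = %.3f, sd = %.3f" % strong)
```

Under this reading both priors centre on 0.25 incidents per week, but the strong prior has roughly one third of the standard deviation, and its β = 40 behaves like 40 weeks of prior observation.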

2.2  Example  —  DoD  Class  A  aviation  mishaps 

This  example  considers  the  Class  A  aviation  mishaps  for  the  entire  Department 
of  Defense  for  Fiscal  Year  1997,  with  the  data  by  week.  The  data  was  provided 
by  the  Assistant  Deputy  Under  Secretary  of  Defense  for  Safety  and  Occupational 
Health,  and  appears  in  Table  2. 




Week  Number    Week  Number
   1       0      28       1
   2       0      29       1
   3       0      30       0
   4       0      31       0
   5       0      32       0
   6       0      33       0
   7       0      34       1
   8       0      35       0
   9      10      36       0
  10       1      37       1
  11       0      38       1
  12       0      39       1
  13       0      40       0
  14       0      41       1
  15       0      42       0
  16       1      43       0
  17       0      44       0
  18       1      45       0
  19       0      46       0
  20       0      47       0
  21       0      48       0
  22       2      49       0
  23       0      50       0
  24       0      51       0
  25       0      52       0
  26       0      53       1
  27       0

Table 1: Data on the number of Class A aviation accidents by week for the
United States Army, Fiscal Year 1997. Source: Assistant Deputy Undersecretary
of Defense for Safety and Occupational Health, US Department of Defense.

Figure 1: Predictive control chart for isolated departures with a weak prior
applied  to  the  Army  aviation  data. 


Figure  2:  Predictive  cumulative  sum  control  chart  for  persistent  departures  with 
a  weak  prior  applied  to  the  Army  aviation  data. 


Figure 3: Posterior distribution for λ after week 53 for the Army aviation data,
with  a  weak  prior  distribution. 


Figure  4:  Predictive  control  chart  for  isolated  departures  with  a  strong  prior 
applied  to  the  Army  aviation  data. 

Figure 5: Predictive cumulative sum control chart for persistent departures with
a  strong  prior  applied  to  the  Army  aviation  data. 


Figure 6: Posterior distribution for λ after week 53 for the Army aviation data,
with  a  strong  prior  distribution. 

Week  Number    Week  Number    Week  Number
   1       0      19       2      37       1
   2       2      20       2      38       1
   3       2      21       2      39       3
   4       1      22       3      40       0
   5       1      23       0      41       2
   6       0      24       1      42       0
   7       0      25       1      43       0
   8       2      26       1      44       1
   9       1      27       3      45       1
  10       3      28       1      46       2
  11       1      29       5      47       4
  12       0      30       1      48       0
  13       0      31       1      49       1
  14       0      32       1      50       1
  15       3      33       1      51       5
  16       1      34       2      52       1
  17       0      35       1      53       2
  18       2      36       1

Table  2:  Class  A  aviation  incidents  by  week  for  FY97,  with  overlap  since  the 
weeks  run  Sunday  to  Saturday.  Source:  DOD. 


We  note  some  obvious  shortcomings  in  the  data  before  we  proceed  to  analyze 
it.  First,  it  seems  reasonable  that  the  expected  number  of  accidents  would  be 
proportional to the number of operating hours. Unfortunately, data on
operating hours was not provided. As a result, the analysis is not able to
account for decreases due to reduced flying hours (such as usually occur during
the Christmas-New Year’s holiday period). The data does seem to be reasonably
well  fit  by  the  Poisson  distribution,  however,  based  on  the  Poisson  dispersion 
test. 
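
The dispersion check can be illustrated on the weekly counts of Table 2. This is a minimal sketch, assuming that the "Poisson dispersion test" is the classical index-of-dispersion statistic D = Σ(x − x̄)² / x̄, which is approximately chi-square with n − 1 degrees of freedom when the counts are Poisson. The variable and function names are illustrative, and no p-value is computed here.

```python
# Weekly DoD Class A aviation mishaps, FY97, weeks 1-53 (Table 2).
dod_fy97 = [0, 2, 2, 1, 1, 0, 0, 2, 1, 3, 1, 0, 0, 0, 3, 1, 0, 2,
            2, 2, 2, 3, 0, 1, 1, 1, 3, 1, 5, 1, 1, 1, 1, 2, 1, 1,
            1, 1, 3, 0, 2, 0, 0, 1, 1, 2, 4, 0, 1, 1, 5, 1, 2]

def dispersion_statistic(counts):
    """Return (D, degrees of freedom) for the index-of-dispersion test.

    D is the sum of squared deviations from the sample mean divided by
    the sample mean; under a Poisson model D is close to n - 1.
    """
    n = len(counts)
    mean = sum(counts) / n
    squared_deviations = sum((x - mean) ** 2 for x in counts)
    return squared_deviations / mean, n - 1

stat, df = dispersion_statistic(dod_fy97)
print("D = %.2f on %d degrees of freedom" % (stat, df))
```

For these counts D is about 55.5 on 52 degrees of freedom, so the variance is only slightly above the mean, in line with the statement that the Poisson model fits reasonably well.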

We proceed to analyze the data, obtaining Figures 7 and 8.

Analysis  of  these  graphs  indicates  two  weeks  (29  and  51)  with  unusually  large 
numbers  of  accidents.  The  second  week,  in  September  of  1997,  coincided  with 
increased  press  attention  on  military  aviation  mishaps.  There  was  no  evidence 
of  a  persistent  increase  in  the  underlying  accident  rate,  however. 

This  raises  the  issue  of  what  an  appropriate  response  to  an  isolated  bad  week 
should  be.  Is  a  DOD-wide  stand  down  appropriate  for  an  isolated  week?  Or 
should  that  extraordinary  corrective  measure  be  husbanded  and  used  only  when 
there  is  evidence  of  a  persistent  problem?  Why  was  there  not  a  stand  down  in 
response to week 29’s accidents?

The  answers  to  these  questions  are  informed  by  the  analysis  that  the  methods 
of  this  paper  provide. 

Figure 7: The chart for isolated departures for the Class A aviation data. Notice
that  weeks  29  and  51  have  unusually  high  numbers  of  incidents. 

3  Example  —  Rapes  in  New  York  City  in  1996 

3.1  Setting 

Our  second  example  is  provided  by  the  New  York  City  Police  Department. 
Crime  statistics  for  each  precinct  are  collected  weekly.  We  are  interested  in 
the  number  of  reported  rapes  each  week  for  a  given  precinct.  We  assume  that 
the  distribution  of  reported  rapes  follows  a  Poisson  distribution.  (A  test  of 
this  assumption  was  made  on  the  entire  data  set,  using  the  “Poisson  dispersion 
test” referenced in Perry and Mead [1979], with p-value 0.225.) We set α = .02,
roughly  corresponding  to  one  false  alarm  per  year  while  in  control.  We  use  a 
vague  prior  initially. 

The  population  of  the  precinct  is  assumed  constant,  as  is  the  length  of  the 
reporting period. For ease, we have set riiU = 1, so λ is the mean number of
arrests  per  week  in  the  precinct. 

The  number  of  reported  rapes  is  provided  in  Table  3. 

3.2  Initial  charts 

The  predictive  control  chart  is  included  at  Figure  9,  and  indicates  that  week  24, 
with  six  rapes,  had  an  unusually  high  number  of  rapes.  This  would  be  cause  for 
investigation  by  the  precinct  commander  and  his  staff. 
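
To see why six reports in week 24 stands out, one can compute the predictive probability of a count that large given the first 23 weeks of Table 3. The sketch below rests on two assumptions that are not spelled out in this section: the rate has a gamma posterior with shape a and rate b, so that the next count is negative binomial, and the vague prior is approximated by a = b = 0 before any data. The exact signalling rule of the paper's chart may differ.

```python
import math

def predictive_pmf(k, a, b):
    """P(next count = k) when the rate is Gamma(shape=a, rate=b).

    This is the negative binomial pmf with size a and success
    probability b / (b + 1), evaluated on the log scale for stability.
    """
    p = b / (b + 1.0)
    log_pmf = (math.lgamma(k + a) - math.lgamma(a) - math.lgamma(k + 1)
               + a * math.log(p) + k * math.log(1.0 - p))
    return math.exp(log_pmf)

def upper_tail(x, a, b):
    """P(next count >= x) under the predictive distribution."""
    return 1.0 - sum(predictive_pmf(k, a, b) for k in range(x))

# Reported rapes for weeks 1-23 (Table 3).
weeks_1_to_23 = [1, 1, 3, 2, 1, 2, 5, 0, 1, 1, 5, 4,
                 1, 2, 3, 2, 2, 2, 2, 1, 1, 1, 4]

# Assumed vague prior: a0 = b0 = 0, so the posterior is (total, weeks).
a = float(sum(weeks_1_to_23))
b = float(len(weeks_1_to_23))

tail = upper_tail(6, a, b)
print("P(week 24 count >= 6) = %.4f" % tail)
```

Under these assumptions the probability of six or more reports is about 0.02, which sits right at the false-alarm level chosen in the Setting section.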

The posterior distribution of λ, after the 52 observations, is at Figure 11.
Notice there is still a fair amount of uncertainty about the true mean rate of
rapes. The point estimate for λ is 2.28. The posterior distribution for λ, after
only 22 weeks, is given at Figure 12. After 22 weeks, the point estimate for λ is
1.9545 = G22/H22.
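
The two point estimates can be checked from the counts in Table 3. The sketch below assumes the usual conjugate update for a Poisson rate with a gamma prior (add the total count to the shape, add the number of weeks to the rate) and approximates the vague prior by zero for both parameters. Reading G and H as the running shape and rate is an interpretation of the notation above, and the function name is illustrative.

```python
# Reported rapes by week, 75th Precinct, 1996 (Table 3), weeks 1-52.
rapes = [1, 1, 3, 2, 1, 2, 5, 0, 1, 1, 5, 4, 1,
         2, 3, 2, 2, 2, 2, 1, 1, 1, 4, 6, 3, 2,
         2, 4, 3, 2, 4, 3, 4, 2, 2, 4, 1, 2, 1,
         1, 4, 4, 0, 2, 0, 3, 4, 2, 3, 1, 2, 1]

def gamma_posterior(counts, alpha0=0.0, beta0=0.0):
    """Posterior (shape, rate) for a Poisson rate after observing counts.

    Assumption: one exposure unit per week, so the rate parameter
    grows by one for every week observed.
    """
    return alpha0 + sum(counts), beta0 + len(counts)

g22, h22 = gamma_posterior(rapes[:22])
g52, h52 = gamma_posterior(rapes)

print("after 22 weeks: %d / %d = %.4f" % (g22, h22, g22 / h22))
print("after 52 weeks: %d / %d = %.4f" % (g52, h52, g52 / h52))
```

With these assumptions the 22-week estimate is 43/22 = 1.9545, matching the figure quoted above, and the full-year estimate is 119/52, which is about 2.288 and agrees with the reported 2.28 up to rounding.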

Week    N    Week    N
   1    1      27    2
   2    1      28    4
   3    3      29    3
   4    2      30    2
   5    1      31    4
   6    2      32    3
   7    5      33    4
   8    0      34    2
   9    1      35    2
  10    1      36    4
  11    5      37    1
  12    4      38    2
  13    1      39    1
  14    2      40    1
  15    3      41    4
  16    2      42    4
  17    2      43    0
  18    2      44    2
  19    2      45    0
  20    1      46    3
  21    1      47    4
  22    1      48    2
  23    4      49    3
  24    6      50    1
  25    3      51    2
  26    2      52    1

Table  3:  Number  of  reported  rapes  by  week  in  the  75th  Precinct,  New  York 
City  Police  Department,  for  1996.  Data  provided  by  Detective  Josef  Falletta, 
NYCPD. 
Figure 8: The chart for persistent departures for the Class A aviation data.
There is no evidence of a persistent shift in the average number of accidents per
week.


3.3  Restarting  the  charts 

We  illustrate  in  this  section  the  ease  of  restarting  the  charts. 

Since  we  have  warmer  weather,  an  indication  of  a  step  change  after  week 
22,  and  an  unusually  large  observation  at  week  24,  we  restart  both  charts  from 
week  23.  We  have  some  history  about  the  process  from  the  first  22  weeks,  so 
we  will  use  an  informed  prior. 

We believe that the new λ may be m = 2.5, plus or minus s = .5. We
obtained this estimate from considering the increased average of weeks 23-28,
but discounting week 24, which was possibly an outlier. We are unsure about
the precision so we use a large value of s. Matching moments (prior mean
α/β = m and prior variance α/β² = s²), we obtain α = 25 and β = 10. We use
these new values for our prior distribution when we restart the charts.
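The moment-matching step can be written out as a short calculation. This is a minimal sketch assuming the prior is a Gamma distribution parameterized by shape and rate, so that its mean is shape/rate and its variance is shape/rate². The function name is illustrative.

```python
def gamma_from_moments(mean, sd):
    """Return (shape, rate) of the Gamma distribution with the given mean and sd."""
    rate = mean / sd ** 2
    shape = mean * rate
    return shape, rate

alpha, beta = gamma_from_moments(2.5, 0.5)
print(alpha, beta)  # 25.0 10.0
```

A larger s gives a smaller rate parameter, which is another way of saying the prior carries the weight of fewer weeks of data.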

We  could  have  validly  held  other  beliefs  about  the  parameters. 

For the rest of the year, we obtain the charts at Figures 14, 13, and 16.
At the end of the year, we see that the new point estimate for λ is
2.469, an increase of 26% over the rate for the first 22 weeks.

In  Figure  15,  we  see  that  there  is  no  further  step  change  in  the  reporting 
patterns. 

From Figure 13, it appears we underestimated the new value of λ, but the
chart adjusts quickly to a value closer to 3 than to 2.5.

3.4  Analysis  and  comments 

The rate of reported rapes appears to have increased in week 23, and then held
steady for the rest of the year. This suggests that there may be a seasonal
component in the rape rates, related to the summer months. However, analysis of
the city-wide data shows no corresponding city-wide increase, which eliminates
the seasonal argument. Additionally, the city as a whole experienced a persistent
and marked decrease in the reports of rape starting at week 40, and this
precinct did not experience such a drop. We discuss this in the next section.

Figure 9: Predictive Control chart for isolated departures from the expected
number of rapes per week in the 75th Precinct, NYCPD. There is only one
signal, at week 24.

The signal in week 24 may be due to the increase in λ, not an isolated special
cause.

These methods only signal that there has been a change. They do not
explain why, but they do give an indication of when the shift likely occurred,
which is very useful for diagnostic work. The commander wants to know why
shifts occur. Modeling λ as a function of other covariates (such as weather,
overtime, number of officers on duty, unemployment) would be a useful exercise
in Poisson regression. This would provide additional useful information to the
precinct commander.

3.5 City-wide charts

Although not developed in this paper, Olwell (1997a) has developed similar
predictive control charts for the normal distribution. The self-starting charts of
Hawkins (1987) (Hawkins and Olwell, 1998) can also be used, and are equivalent
to the predictive chart when the predictive chart has a vague prior distribution.
We assume a vague prior, and consider the city-wide data. The sum of all the
rapes in all of New York’s precincts is well-modeled by the normal distribution,
both for empirical and theoretical reasons. The chart for the mean for the
city as a whole is shown in Figure 17 to support the earlier assertions that
the phenomena affecting the 75th precinct are local and not city-wide. From
the chart we see that there was no city-wide increase at week 23, when the
75th precinct was experiencing an increase, and there was a city-wide decrease
beginning at week 43 which was not reflected in the precinct.

Comparisons such as this would be useful for the Army. The analog is
monitoring, say, division-level data and Army-level data simultaneously, and
seeing if emerging trends were local or Army-wide.

Figure 10: Chart for persistent departures for the NYCPD data. This chart
signals an upward shift in the average number of reported rapes in the precinct,
beginning at week 23, and signaled at week 33.


4  Example  -  DUIs  in  a  division 

This  military  example  is  based  on  the  number  of  incidents  of  Driving  Under  the 
Influence  (DUI)  for  an  armored  division  in  Europe  during  the  first  six  months 
of  1997.  We  assume  that  the  number  of  incidents  for  a  given  reporting  period 
follows  a  Poisson  distribution,  with  mean  proportional  to  the  number  of  assigned 
soldiers  and  days  in  the  reporting  period. 

We  believe  that  the  true  average  number  of  incidents  is  about  10  per  week. 
We  obtained  that  number  by  looking  at  previous  historical  data,  but  we  could 
have  also  obtained  it  by  a  formal  elicitation  of  a  prior  using  the  methods  of 
Section 2.3. We elicit α = 20 and β = 2, which means that we believe the
strength of our prior opinion is the equivalent of two weeks of data.
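The reading of β = 2 as "two weeks of data" can be illustrated numerically: the posterior mean is a weighted average of the prior mean α/β and the sample mean, with weight β/(β + n) on the prior. The sketch below uses made-up weekly counts, not the Division's data, and assumes one unit of exposure per week for simplicity.

```python
def posterior_mean(prior_shape, prior_rate, counts):
    """Posterior mean of a Poisson rate under a Gamma(prior_shape, prior_rate) prior."""
    return (prior_shape + sum(counts)) / (prior_rate + len(counts))

alpha, beta = 20.0, 2.0            # elicited prior from the text
hypothetical = [14, 12, 13, 15]    # made-up weekly counts, for illustration only

n = len(hypothetical)
weight_on_prior = beta / (beta + n)
blend = (weight_on_prior * (alpha / beta)
         + (1 - weight_on_prior) * (sum(hypothetical) / n))

print(posterior_mean(alpha, beta, hypothetical), blend)
```

After four weeks the prior carries one third of the weight, and its influence keeps shrinking as more weeks are observed.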

We  select  an  ARL  of  500  for  the  run  lengths  for  both  the  isolated  and 
persistent  departure  charts.  The  average  population  of  the  Division  is  19,600 
soldiers,  and  the  reporting  period  is  seven  days.  From  ANYGETH.exe,  we 
obtain  the  decision  intervals  h^)  and  the  values  of  and  /?“  for  the  input 
screen. 

The  preceding  information  is  entered  in  the  input  screen  shown  in  Figure  18. 

4.1  Analysis 

The isolated chart signals three unusually bad periods, at weeks 20, 21, and
24. The persistent chart signals an upward increase in the average rate, and
estimates that it began at period 19, the last time that S^+ was on the horizontal
axis. Clearly the Division has experienced a significant change in DUI behavior,
and the leadership should investigate and react immediately.

Figure 11: The posterior distribution for λ after all 52 observations of the 75th
precinct rape data.

Figure 12: The posterior distribution for λ after the first 22 observations of the
75th precinct rape data.
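The idea of dating a shift by the last time the cumulative statistic sat on the horizontal axis can be sketched with a plain one-sided CUSUM. This is the classical Page scheme applied to raw counts, not the predictive scheme used in the paper. The reference value k, the decision interval h, and the data are all hypothetical.

```python
def upper_cusum(counts, k, h):
    """One-sided upper CUSUM: S_t = max(0, S_{t-1} + x_t - k).
    Returns (signal_period, last_zero_period), both 1-based.
    signal_period is None if S never exceeds h."""
    s = 0.0
    last_zero = 0
    for t, x in enumerate(counts, start=1):
        s = max(0.0, s + x - k)
        if s == 0.0:
            last_zero = t          # most recent period with S on the axis
        if s > h:
            return t, last_zero
    return None, last_zero

# Hypothetical weekly counts with a jump after period 3.
demo = [1, 1, 1, 5, 5, 5]
print(upper_cusum(demo, k=2, h=5))  # (5, 3)
```

Here the chart signals at period 5 and points back to period 3 as the last time the statistic was at zero, which is the kind of change-point estimate described in the text.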

We  know  the  process  has  changed,  so  we  investigate  why  this  might  have 
occurred.  We  are  greatly  aided  in  the  investigation  by  knowing  about  when  the 
change  occurred.  In  this  case,  this  roughly  corresponds  to  the  time  period  when 
the  troops  of  1st  AD  were  notified  officially  that  they  would  be  redeployed  to 
Bosnia. 

In  the  meantime,  we  should  restart  the  charts  to  monitor  the  process  at  its 
new  level.  We  will  want  to  know  if  the  DUIs  remain  at  this  new  average  level,  if 
they  decrease,  or  if  they  further  increase.  The  restarted  chart  will  give  us  that 
information. 
Figure 13: The restarted plot of the posterior mean and 95% probability limits
for λ through observation 52 of the 75th precinct rape data, with informed prior.


Figure  14:  The  restarted  predictive  control  chart  for  an  isolated  departure 
through  observation  52  of  the  75th  precinct  rape  data,  with  informed  prior. 

5  Organizational  Implications 

There  is  a  saying  in  the  Army,  “The  unit  does  well  those  things  the  boss  checks.” 
Setting  up  a  monitoring  system  using  the  methods  of  this  paper  can  have  many 
effects  on  an  organization.  They  can  be  good  and  bad. 

Perhaps  the  best  effect  is  that  these  tools  focus  leaders  on  the  distinction 
between usual and unusual behavior. Overreaction to a process in control can
send  the  process  out  of  control.  The  methods  provide  graphical  evidence  of 
when  extraordinary  intervention  is  and  is  not  warranted.  This  was  illustrated 
in  the  discussion  of  the  DoD  aviation  data. 

Second, these tools provide sharp methods for distinguishing between usual
and unusual behavior, especially when there are small persistent changes. We
saw that the senior leadership of the New York Police Department was unaware
of the persistent trends which had emerged in the number of reported rapes at
precinct and city level in 1996, until these methods were applied to the data.

Figure 15: The chart for a persistent change, restarted after week 23 for the
75th precinct data. Note there is no evidence of a further change in the average
number of reported rapes.

Third,  because  the  methods  are  spreadsheet  based,  they  can  be  maintained 
easily  and  require  no  specialized  software.  This  is  a  huge  advantage,  because 
organizations  will  be  less  resistant  to  using  easily  implemented  methods. 

There are some implications which are of concern. We must continue to
emphasize that while the level of misconduct or accidents might be at a usual level,
no misconduct or accident is ever acceptable. We continue to work vigorously
to identify and eliminate the causes of these shortcomings. That is the function
of the systems which the organization has in place already to deal with these
issues. These tools identify learning opportunities and periods where
extraordinary action is warranted. We must be vigilant to avoid sending the message
that “usual” is “acceptable.” It is not, if soldiers are being hurt. This really
requires increased education for Army leaders in the power and limitations of
applying statistical thinking to managing the Army.

Secondly, we must be careful not to overburden subordinate units with
reporting requirements. There is a trade-off between the frequency of reports and
the speed of detection of persistent shifts. However, there is an organizational
cost to meeting reporting requirements, especially in the time of subordinate
leaders, an already scarce commodity.

Third, we must be sensitive to the effects of data collection. If we
continuously ask female soldiers, “Have you been harassed today? Have you been
harassed today?”, we can create a hostile working environment for both the
questioners and the questioned. If we ask for additional demographic
identification, such as race, we can further aggravate soldiers.

Last, we must remind soldiers that the systems that are not being monitored
are also important. Not charting may send the message that a process is
unimportant, when it might just be unmeasurable.
Figure 16: The restarted posterior distribution for λ for weeks 23-52, as of week
52, of the 75th precinct rape data, with informed prior. Note the shift to the
right when compared with Figure 12.

7 Conclusion

Discrete  events  well  modeled  by  the  Poisson  distribution  with  mean  proportional 
to  either  time  or  number  of  items  are  everywhere  in  the  Army.  This  paper  has 
shown  a  method  for  monitoring  their  occurrence  to  detect  unusual  periods  or 
persistent  shifts  in  the  underlying  rate. 

These  methods  have  application  to  misconduct  and  other  social  behavior,  as 
well  as  to  weapons  systems,  vehicles,  aircraft,  and  tools,  among  others.  They 
offer  managers  a  means  for  monitoring  the  performance  of  these  key  pieces  of 
equipment,  when  extensive  historical  data  is  missing  or  when  the  process  does 
not  stay  stable  for  long  periods  of  time. 

The unique ability of these methods to learn from the data, and to be
restarted when necessary, makes them well suited for an ever-changing Army.
The Excel-based implementation makes them accessible to soldiers without
requiring special software.

Figure 19: Data screen for Excel program for predictive Poisson control scheme
for the Division DUI example.

These  methods  are  ready  for  implementation  by  the  Army.  They  have  been 
implemented  into  spreadsheet  form  for  ease  of  use.  As  of  January,  1998,  they  are 
being  fielded  with  the  operations  research  analysts  supporting  the  1st  Armored 
Division in Bosnia. While they require some training in their setup and
interpretation, the graphical output is easy to follow. The author hopes that they
will  soon  be  part  of  the  leader  tool  kit  for  monitoring  key  command  indicators 
in  the  entire  Army. 

8  References 

1. Aitchison, J., and I. R. Dunsmore (1975) Statistical Prediction Analysis.
Cambridge: Cambridge University Press.

2. Bickel, P. J., and Doksum, K. A. (1977) Mathematical Statistics: Basic
Ideas and Selected Topics. San Francisco: Holden-Day.

3. Geisser, Seymour (1993) Predictive Inference: An Introduction. New York:
Chapman & Hall.

4. Hawkins, Douglas M. (1987) Self-starting cusums for location and scale.
The Statistician, Vol. 36, pp. 299-315.

5. Hawkins, D. M. and Olwell, D. H. (1998) Cumulative Sum Control Charts
and Charting for Quality Improvement. New York: Springer-Verlag.

6. Montgomery, Douglas C. (1991) Introduction to Statistical Process
Control, 2nd Ed. New York: Wiley.

7. Moustakides, G. V. (1986) “Optimal stopping times for detecting changes
in distributions,” Annals of Statistics, Vol. 14, pp. 1379-1387.

8. Olwell, David H. (1996) Topics in Statistical Process Control. Ann Arbor:
University Microfilms.

9. Olwell, David H. (1997a) Predictive Quality Control Charts. Proceedings
of the Second Annual US Army Conference on Applied Statistics, 23-25
October 1996. Ed. Barry Bodt. Aberdeen, MD: US Army Research Lab.
Pp. 67-84.

10. Olwell, David H. (1997b) Statistical Process Control of Low Frequency
Events. Proceedings of the 5th Annual US Army Research Laboratory/
United States Military Academy Technical Symposium. Ed. Don Engen.
West Point, NY: US Military Academy. Pp. 115-130.

11. Olwell, David H. (1998) Managing Misconduct: Statistical Process
Control applied to Sexual Harassment and Other Crimes in the Military. 1997
Proceedings of the Section on Quality and Productivity of the American
Statistical Association. Fairfax, VA: American Statistical Association.

12. Page, E. S. (1954) “Continuous inspection schemes,” Biometrika, Vol.
41, pp. 100-115.

13. Page, E. S. (1961) “Cumulative sum charts,” Technometrics, Vol. 3, pp.
1-9.

14. Perry, J. N. and R. Mead (1979) On the power of the index test to detect
spatial pattern. Biometrics, Vol. 35, pp. 613-622.

15. Quesenberry, C. P. (1991) “SPC Q charts for a Poisson parameter λ: short
or long runs,” Journal of Quality Technology, Vol. 23, No. 4, pp. 296-303.

16. Quesenberry, C. P. (1995) “On Properties of Poisson Q Charts for
Attributes,” Journal of Quality Technology, Vol. 27, No. 4, pp. 293-303.

17. Shewhart, Walter A. (1931) Economic Control of Quality of Manufactured
Product. New York: Van Nostrand.

18. Tierney, Luke (1990) LISP-STAT. New York: Wiley.

19. Western Electric (1956) Statistical Quality Control Handbook.
Indianapolis, IN: Western Electric Corporation.


Figure 20: Chart for isolated departures, from Excel program for predictive
Poisson control scheme for DUI example. This chart signals weeks 20, 21, and
24 as unusually high periods.

Figure 21: Chart for persistent departures, from Excel program for predictive
Poisson control scheme for DUI example. This chart signals a persistent increase
in the average number of incidents, which began at week 19 and was signaled
at week 21.


Using  Pearson  and  Spearman  Statistics  to  Look  for  a  Dependence 
Relationship  between  Two  Variables 

Dr. Lorrie L. Hoffman, Department of Statistics, University of Central Florida
Mr. Dan Corson, IMAGE intern from Florida A&M University


Introduction 

The purpose of this study is to make data analysts acutely aware of the differences
between two popular statistics: Spearman's rank correlation coefficient and Pearson's
chi-square test for independence. Both of these statistics are used to reveal relationships
between two variables x and y. Spearman's will uncover linear relationships, i.e., where y
can be adequately predicted as a linear function of x. Pearson's chi-square is less stringent
in that the relationship need not be linear. The simulations conducted and reported herein
will reinforce the fact that Pearson's test can reveal a dependence between x and y while
Spearman's will not. This occurs simply because the relationship is not a linear one.

Both  statistics  are  considered  nonparametric  since  knowledge  of  the  underlying  original 
distribution  of  the  data  is  not  required  in  order  to  know  their  sampling  distributions.  See 
Practical  Nonparametric  Statistics.  2nd  edition,  W.  J.  Conover,  John  Wiley  and  Sons,  New 
York,  1980,  for  a  reference.  The  exact  sampling  distribution  for  Spearman's  rank 
correlation  coefficient  in  the  case  where  x  and  y  are  continuous  can  be  derived 
theoretically via a combinatorial argument (Conover, p. 252). For large sample sizes a
normal approximation suffices. The exact distribution for Pearson's test can also be
generated by viewing the contingency table (the frequency of occurrences as defined in a
two-way classification table) as a realization from a multinomial distribution and
considering all possible outcomes. For adequately sized cell counts in the classification
(contingency) table a chi-square approximation suffices (Conover, p. 160).

Recall that Spearman searches for relationships by relying on the rank of each x among
all x’s and of y among all y’s and testing whether large x ranks pair with large y ranks
(positive correlation) or large x ranks pair with small y ranks (negative correlation).
Pearson’s approach can be explained as one which slices up the 3-D histogram (the cell
counts being the heights corresponding to the varying x columns and y rows) and then
checks to see if those slices (near equivalent to conditional distributions) are
proportionally identical.
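The rank-based calculation can be sketched in a few lines. The code below is an illustrative implementation, not the SPSS routine used in the study. It computes Spearman's coefficient as the product-moment correlation of the ranks, with ties given average ranks. The tiny data sets are hypothetical and chosen so that one relationship is increasing and the other is a perfect but non-monotone dependence.

```python
def average_ranks(values):
    """Ranks 1..n, with tied values sharing the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2.0 + 1.0      # average position of the tied block
        for m in range(i, j + 1):
            ranks[order[m]] = shared
        i = j + 1
    return ranks

def product_moment(a, b):
    """Ordinary product-moment correlation of two equal-length lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    sab = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    saa = sum((u - ma) ** 2 for u in a)
    sbb = sum((v - mb) ** 2 for v in b)
    return sab / (saa * sbb) ** 0.5

def spearman_rho(x, y):
    """Spearman's coefficient: the product-moment correlation of the ranks."""
    return product_moment(average_ranks(x), average_ranks(y))

x = [-2, -1, 0, 1, 2]
print(spearman_rho(x, [v ** 3 for v in x]))  # increasing relation
print(spearman_rho(x, [v ** 2 for v in x]))  # symmetric, non-monotone relation
```

In the second call y is completely determined by x, yet the rank correlation is zero, which is the situation the simulations are designed to expose.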

Methodology 

We needed to choose a family of distributions which were capable of exhibiting
non-linear dependence. A mixture of bivariate normals meets this criterion. We have
restricted our attention to

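\[
f(x,y) = \frac{.5}{2\pi\sqrt{1-\rho_1^2}} \exp\left(-\frac{.5}{1-\rho_1^2}\,(x^2 - 2\rho_1 x y + y^2)\right)
       + \frac{.5}{2\pi\sqrt{1-\rho_2^2}} \exp\left(-\frac{.5}{1-\rho_2^2}\,(x^2 - 2\rho_2 x y + y^2)\right), \tag{1}
\]

for -\infty < x < +\infty and -\infty < y < +\infty.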
That is, it is equally likely we sample an <x,y> pair from a normal with a zero-vector
mean, variances of 1 and a covariance of rho(1) as to sample from a normal with
zero-vector mean, variances of 1 and a covariance of rho(2).

Using the suggested generation scheme in Multivariate Simulation, M.E. Johnson, John
Wiley and Sons, New York, 1985, and SPSS software version 6.1 for Windows we
generated 300 pairs for each investigation. We restricted ourselves to five cases
consisting of setting rho(1) to .50 and rho(2) = -.75, -.50, .0, .50, .75. We looked at
k=10 runs of 300 pairs for each of the five cases.

The required steps to generate these mixed bivariate normals are:

1. Generate U, where U is from a uniform distribution on (0,1)

2. Generate X11 from a normal distribution with mean of 0 and variance of 1

3. Generate X12 from a normal distribution with a mean of rho(1)*X11 and a variance
of (1-rho(1)**2)

4. Generate X21 from a normal distribution with mean of 0 and variance of 1

5. Generate X22 from a normal distribution with a mean of rho(2)*X21 and a variance
of (1-rho(2)**2)

6. X=X11 if U<=1/2
Y=X12 if U<=1/2
X=X21 if U>1/2
Y=X22 if U>1/2.
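The six steps above can be sketched as a short routine. This is an illustrative version using Python's standard library rather than the SPSS procedure used in the study. For brevity it draws only the component selected by U, which gives the same distribution as generating both pairs and then choosing one. The function name and the seed are arbitrary.

```python
import math
import random

def sample_mixture(n, rho1, rho2, rng):
    """Draw n (x, y) pairs from an equal mix of two standard bivariate
    normals with correlations rho1 and rho2."""
    pairs = []
    for _ in range(n):
        u = rng.random()                                     # step 1
        rho = rho1 if u <= 0.5 else rho2                     # step 6: choose component
        x = rng.gauss(0.0, 1.0)                              # steps 2 and 4
        y = rng.gauss(rho * x, math.sqrt(1.0 - rho * rho))   # steps 3 and 5
        pairs.append((x, y))
    return pairs

rng = random.Random(1997)   # arbitrary seed, used only for reproducibility
pairs = sample_mixture(300, 0.50, -0.75, rng)
print(len(pairs), pairs[0])
```

Note that `gauss` takes a standard deviation, so the variance 1 - rho**2 from steps 3 and 5 is passed as its square root.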

The  Spearman  test  statistic  value  is  calculated  in  SPSS  by  using  the  BIVARIATE 
CORRELATION  macro  under  the  STATISTICS  drop-down  menu  and  selecting  the 
Spearman  option.  The  Pearson  test  statistic  value  is  calculated  by  using  the 
SUMMARIZE-DESCRIPTIVE-CROSSTABS macro under the STATISTICS
drop-down  menu  after  classifying  the  x  and  y  values  into  these  categories: 

Category is -2 if x < -1.0, category is -1 if -1.0 <= x < 0.0, category is 1 if 0.0 <= x < 1.0
and category is 2 if x >= 1.0, with an identical scheme for y. This creates a 4 by 4 contingency
table with 16 cells for generating Pearson's chi-square test statistic. Note that this
gridding decision is arbitrary and very well may affect the ability of Pearson's test statistic
to discern independence. This concern is left to later research.
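The gridding and the resulting test statistic can be sketched as follows. This is an illustrative implementation of the standard Pearson chi-square statistic, not the SPSS CROSSTABS output. The handful of demonstration pairs is hypothetical. With four categories on each axis the statistic is referred to a chi-square distribution with (4-1)(4-1) = 9 degrees of freedom.

```python
def categorize(v):
    """Four-category scheme from the text: -2, -1, 1, 2."""
    if v < -1.0:
        return -2
    if v < 0.0:
        return -1
    if v < 1.0:
        return 1
    return 2

def contingency_table(pairs):
    """4 by 4 table of counts, rows indexed by x category, columns by y category."""
    index = {c: i for i, c in enumerate([-2, -1, 1, 2])}
    table = [[0] * 4 for _ in range(4)]
    for x, y in pairs:
        table[index[categorize(x)]][index[categorize(y)]] += 1
    return table

def pearson_chi_square(table):
    """Sum over cells of (observed - expected)^2 / expected, where
    expected = row total * column total / grand total."""
    row = [sum(r) for r in table]
    col = [sum(c) for c in zip(*table)]
    total = sum(row)
    stat = 0.0
    for i, r in enumerate(table):
        for j, obs in enumerate(r):
            expected = row[i] * col[j] / total
            if expected > 0:          # skip cells in empty rows or columns
                stat += (obs - expected) ** 2 / expected
    return stat

demo_pairs = [(-1.5, -1.2), (-0.3, 0.4), (0.2, 0.9),
              (1.7, 1.1), (0.5, -0.5), (-0.8, -2.0)]
table = contingency_table(demo_pairs)
print(table, pearson_chi_square(table))
```

Six pairs are far too few for the chi-square approximation; the demonstration only shows the mechanics.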

The  p-values  reported  herein  are  gleaned  from  the  SPSS  output  which  relies  on  the 
two-sided  tail  areas  under  the  approximate  normal  distribution  for  Spearman  and  on  a 
chi-square  approximation  for  Pearson. 

Since we know the true original distribution (mixed bivariate normals) and have
derived the conditional distributions and correlation associated with each, these
simulations are conducted in order to assess the performance of these test statistics, both
individually and in relationship to each other. (Note: here n=300. We could have varied
this but believe it mimics studies we often see our colleagues in Education and Business
conduct.) We run k=10 simulations for each of the five cases above.

It is direct to show that the conditional distributions associated with f(x,y) are:

\[
f(y|x) = \frac{.5}{\sqrt{2\pi(1-\rho_1^2)}} \exp\left(-\frac{.5}{1-\rho_1^2}\,(y-\rho_1 x)^2\right)
       + \frac{.5}{\sqrt{2\pi(1-\rho_2^2)}} \exp\left(-\frac{.5}{1-\rho_2^2}\,(y-\rho_2 x)^2\right), \tag{2}
\]

for -\infty < y < +\infty.
And the correlation of x and y is

\[
.5\,\rho_1 + .5\,\rho_2. \tag{3}
\]

Recall  that  if  the  conditional  distributions  are  not  identical  for  each  X  then  the 
definition of independence (P(Y=y|X=x) = P(Y=y)) is violated.

In none of these cases is independence true, and in only one case, where rho(1) = .5 and
rho(2) = -.5, is the correlation zero. This statement is reinforced below in the Results
section by presenting the graphs of the conditional distribution for x=0 and for x=1 for
each of the five cases, and by calculating the correlation of x and y. This means that both
Spearman and Pearson should reject their respective hypotheses: H0: rho = 0 (except for
the case where rho(1)=.5 and rho(2)= -.5) and H0: x and y are independent.
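Both claims can be checked numerically for the five cases. The sketch below evaluates the mixture correlation, .5*rho(1) + .5*rho(2), and the conditional density of y given x as an equal mix of two normal densities with means rho*x and variances 1 - rho**2. The function names are illustrative, and the comparison point y = 0 is an arbitrary choice.

```python
import math

def mixture_correlation(rho1, rho2):
    """Correlation of x and y under the equal-weight mixture."""
    return 0.5 * rho1 + 0.5 * rho2

def conditional_density(y, x, rho1, rho2):
    """f(y|x): equal mix of N(rho1*x, 1-rho1^2) and N(rho2*x, 1-rho2^2)."""
    total = 0.0
    for rho in (rho1, rho2):
        var = 1.0 - rho * rho
        total += (0.5 * math.exp(-0.5 * (y - rho * x) ** 2 / var)
                  / math.sqrt(2.0 * math.pi * var))
    return total

for rho2 in (-0.75, -0.50, 0.0, 0.50, 0.75):
    corr = mixture_correlation(0.50, rho2)
    at_x0 = conditional_density(0.0, 0.0, 0.50, rho2)
    at_x1 = conditional_density(0.0, 1.0, 0.50, rho2)
    print(f"rho(2)={rho2:+.2f}  corr={corr:+.3f}  "
          f"f(0|x=0)={at_x0:.4f}  f(0|x=1)={at_x1:.4f}")
```

In every case the two conditional densities differ at y = 0, so x and y are dependent, while the correlation vanishes only when rho(2) = -.50.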

Results 

The results will be presented case by case and then discussed collectively at the end.
We summarize the rejection/acceptance activity in Table 1 and present the calculated
correlation values in Table 2. For each case we show the following: 1) a 3-D wireframe
plot and 2) a contour plot (both graphics generated by MINITAB for WINDOWS) plus
3) graphs of the conditional distribution functions when x = 0 (dashed line) and x = 1
(solid line) via SPSS plots. Table 1 reports the p-values from the k=10 runs for Spearman
and reports the p-values from the k=10 runs for Pearson for the cases where rho(2) is
-.75, -.50, and .0 (omitting the rho(2) = .50 and .75 since the p-values for these other two
cases were all less than .0005). Included at the bottom of Table 1 for both Spearman and
Pearson are the number of rejections of the H0 hypotheses which occur using an alpha
level of .10. Note that the smallest expected cell count for all of the contingency tables
was 4.3 and so the chi-square approximation should be adequate. Table 2 contains the
Spearman correlation coefficients calculated from the 300 points for each of the k=10 runs
for every one of the five cases.

For the case where rho(1)=.50 and rho(2)= -.75 the 3-D plot presents a shape which has a


Figure 1. 3-D plot with rho(2)= -.75

Figure 2. Contour plot with rho(2)= -.75
understood by viewing the contour plots (Figure 2). This realization is an obvious result
of combining an equal mix of bivariate normals: one with its true regression line running
through (0,0) and (1,0.5) and the other one through (0,0) and (1,-.75). The dependence
between x and y is evident when one views two arbitrarily selected conditional
distributions: one where x=1 and the other when x=0. These are presented in Figure 3.

The graphical evidence should lead us to expect that both Pearson’s chi-square statistic
(by noting the dissimilarity of the conditional distributions) and Spearman’s correlation
coefficient (by noting the slight domination of a downward sloping tendency in the
contour plots) will uncover a dependency between x and y. Those values are presented in
Tables 1 and 2 below.


Figure 3. Conditional distributions at x=0 (dashed) and at x=1 (solid);
rho(2)= -.75


Figure  4.  3-D  plot  with  rho(2)=  -.5 


For the case where rho(1)=.50 and rho(2)= -.50 the 3-D plot presents a shape which
has a symmetric four-leaf clover look to its cross-section slices (Figure 4). The difference
when comparing to the previous case is reinforced by viewing the contour plots (Figure
5). This graphical portrayal is an obvious result of combining an equal mix of bivariate
normals: one with its true regression line running through (0,0) and (1,0.5) and the other
one through (0,0) and (1,-.5). The dependence between x and y is evident when one
views two arbitrarily selected conditional distributions: one where x=1 and the other when
x=0. These are presented in Figure 6. In this case the conditional distributions are
symmetric. This is easily confirmed by showing f(y|X=x) = f(y|X=-x) in equation 2. The
graphical evidence should lead us to expect that Pearson’s chi-square statistic will uncover
a dependency between x and y. Since there appears to be no propensity for the values to
cluster about any line we would expect Pearson’s correlation coefficient to be near zero.
Those test statistical values are presented in Tables 1 and 2 below.
Figure 5. Contour plot with rho(2)= -.50

Figure 6. Conditional distributions at x=0 (dashed) and at x=1 (solid);
rho(2)= -.50


For  the  case  where  rho(l)=.50  and  rho(2)=  .0  the  3-D  plot  presents  a  shape  which  has 
an  elliptical  look  to  its  cross-section  slices  (Figure  7).  This  difference  from  previous 
shapes  is  best  seen  in  the  contour  plots  (Figure  8).  This  graphical  image  occurs  due  to 
combining  an  equal  mix  of  bivariate  normals  with  circular  cross-sections  (the  one  with 
zero  correlation)  and  the  other  bivariate  normal  with  its  regression  line  running  through 
(0,0)  and  (1,  .5). 


Figure 7. 3-D plot with rho(2)= .0

Figure 8. Contour plot with rho(2)= .0

The dependence between x and y is evident when one views those two arbitrarily selected
conditional distributions. These are presented in Figure 9. In this case the conditional
distributions exhibit dissimilarities: at x=1 there is more chance of observing positive
y-values (note the larger area to the right of y=0 than to the left) whereas when x=0 it
appears equally likely one would observe a positive or a negative y-value. The graphical
evidence points to a significant Pearson’s chi-square statistic. Since there appears to be
clustering about a line with positive slope we would expect Pearson’s correlation
coefficient to be positive. Those test statistical values are presented in Tables 1 and 2
below.
Figure 9. Conditional distributions at x=0 (dashed) and at x=1 (solid);
rho(2)= .0


Figure  10.  3-D  plot  with  rho(2)=  .50 


For the case where rho(1)=.50 and rho(2)= .50 the 3-D plot is merely the bivariate
normal with a zero mean vector, variances of 1, and a correlation coefficient equal to .50
(Figure 10). The distinction of previous shapes may now be best understood by
comparing them to this ‘non-mixed’ bivariate normal (Figure 11).


Figure 11. Contour plot with rho(2)= .50

Figure 12. Conditional distributions at x=0 (dashed) and at x=1 (solid);
rho(2)= .50

The dependence between x and y is evident when one views those two arbitrarily selected
conditional distributions. It is also well understood theoretically since we know that the
conditional distributions of a bivariate normal are also normal with a mean and variance
equal to, respectively, (4)

$\mu_y + \rho\,(\sigma_y/\sigma_x)\,(x-\mu_x)$ and $\sigma_y^2\,(1-\rho^2)$.

So in this case when x=0 Figure 12 presents a normal with a mean of 0 and variance of .75,
and when x=1 a normal with mean of .50 and variance of .75. Since the distributions are
not identical we conclude that there is a dependence between x and y and should calculate
a Pearson's chi-square statistic, which is significant. Visually we see a clustering about a
line with positive slope and thus expect Pearson's correlation coefficient to be positive; its
value can be found in Table 2 below.
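The two conditional distributions quoted above can be checked with a few lines of Python. This is a minimal sketch, assuming the standardized setting of the text (zero means, unit variances, correlation .50); the function name `conditional_normal` is ours and is only illustrative.

```python
def conditional_normal(x, mu_x=0.0, mu_y=0.0, sigma_x=1.0, sigma_y=1.0, rho=0.5):
    """Mean and variance of y given x for a bivariate normal distribution."""
    mean = mu_y + rho * (sigma_y / sigma_x) * (x - mu_x)
    variance = sigma_y ** 2 * (1.0 - rho ** 2)
    return mean, variance


# The two slices shown in Figure 12: x = 0 and x = 1 with rho = .50.
for x in (0.0, 1.0):
    mean, variance = conditional_normal(x)
    print(f"x = {x}: mean = {mean}, variance = {variance}")
```

Only the conditional mean moves with x; the conditional variance stays at .75 for every slice, which is why the two curves in Figure 12 have the same shape and differ only in location.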
Figure 13. 3-D plot with rho(2)= .75

Figure 14. Contour plot with rho(2)= .75


For the last case where rho(1)=.50 and rho(2)= .75 the 3-D plot appears extremely
narrow  (Figure  13).  This  concentrated  clustering  about  a  regression  line  with  positive 
slope  is  quite  evident  in  Figure  14. 
Figure 15. Conditional distributions at x=0 (dashed) and at x=1 (solid); rho(2)= .75

The dependence between x and y is showcased by viewing those two arbitrarily selected
conditional distributions. These are presented in Figure 15. In this case the conditional
distributions exhibit dissimilarities: at x=1 there is a very large chance of observing
positive y-values (note the larger area to the right of y=0 than to the left) whereas when
x=0 it appears equally likely one would observe a positive or a negative y-value. A
significant Pearson's chi-square statistic should result. Since there appears to be clustering
about a line with positive slope, we would expect Pearson's correlation coefficient to
be positive (Table 2).
                                                   0.000
4          0.006   0.000   0.185   0.001   0.000   0.000
5          0.210   0.000   0.298   0.043   0.007   0.075
6          0.053   0.003   0.487   0.023   0.000   0.008
7          0.264   0.000   0.858   0.067   0.000   0.005
8          0.000   0.000   0.293   0.026   0.001   0.145
9          0.059   0.003   0.128   0.051   0.000   0.000
10         0.062   0.005   0.495   0.441   0.043   0.000
#reject    8       10      0       8       10      8

Table 1. Pearson and Spearman p-values when rho(1)=.5 and rho(2) varies


rho(2)    -0.750   -0.500    0.000    0.500    0.750
run
1         -0.109   -0.001    0.221    0.476    0.608
2         -0.193   -0.005    0.167    0.416    0.570
3         -0.197   -0.066    0.177    0.439    0.583
4         -0.157   -0.077    0.264    0.414    0.561
5         -0.073   -0.060    0.156    0.496    0.629
6         -0.112    0.040    0.225    0.466    0.593
7         -0.065   -0.010    0.301    0.529    0.654
8         -0.223    0.060    0.193    0.474    0.612
9         -0.109   -0.088    0.321    0.576    0.681
10        -0.108    0.039    0.199    0.416    0.593
ave       -0.135   -0.017    0.222    0.470    0.608
rho       -0.125    0.000    0.250    0.500    0.625

Table 2. Calculated rho values with theoretical rho value (at bottom)
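The theoretical rho row at the bottom of Table 2 can be reproduced with the short sketch below. It assumes that the two bivariate normal components are mixed with equal weight 0.5 and that both have zero means and unit variances; the equal weighting is our inference from the tabulated values, and the function name is illustrative. Under those assumptions the mixture covariance, and hence its correlation, is the weighted average of the component correlations.

```python
def mixture_correlation(rho1, rho2, weight=0.5):
    """Correlation of a two-component mixture of standardized bivariate normals.

    Assumes both components have zero means and unit variances, so the
    mixture also has unit variances and its covariance is the weighted
    average of the component correlations.
    """
    return weight * rho1 + (1.0 - weight) * rho2


# rho(1) is fixed at .5 and rho(2) takes the column values of Table 2.
for rho2 in (-0.75, -0.5, 0.0, 0.5, 0.75):
    print(rho2, mixture_correlation(0.5, rho2))
```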


Conclusions 

In each of these cases involving mixtures of bivariate normals Spearman's rho does a
fine job of estimating the true underlying correlation coefficient, as exhibited in Table 2.
This translates to excellence in testing, where we see in Table 1 that Spearman mostly
makes the correct decision on whether to accept or reject H0: rho = 0. Pearson does
equally well. Further investigations should delve into the particulars relating to the
power of these tests. That was not our purpose here.

Our purpose is to spotlight the data in Table 1 in the center columns associated with
the mixture of bivariate normals with a correlation coefficient of zero (i.e. when rho(1)= .5
and rho(2)= -.5). Spearman concludes correctly that rho is zero (in 10 of 10 runs at
alpha of .10) and Pearson concludes correctly that x and y are dependent (in 8 of 10 runs
at alpha of .10). Note that based on Spearman some data analysts might erroneously
report that x and y are 'independent' rather than correctly saying 'uncorrelated', as is true in
this instance. This false deduction occurs due to an error in thinking. An analyst may
forget that linear dependence is not the only kind of dependence. In the other four cases
we present here either Pearson or Spearman statistics adequately reflect the dependence
characteristic inherent in the data, but in the case of our mixed bivariate normals with
rho(1)= .5 and rho(2)= -.5 a researcher must pursue an avenue of investigation beyond
merely asking about linear correlation in order to reveal a dependence relationship. This
cautionary note was the impetus for this work of finding ready examples of non-linear
dependence and the subsequent characterization of those distributions.
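The distinction between 'uncorrelated' and 'independent' in the rho(1)= .5, rho(2)= -.5 case can be made concrete with a small calculation. The sketch below assumes an equal-weight mixture of two standardized bivariate normals (the weight and the function name are our assumptions). Because both components share the same standard normal marginal for x, the mixing weight is unchanged by conditioning on x, so the conditional distribution of y is an equal mixture of two normals.

```python
def mixture_conditional_moments(x, rho1=0.5, rho2=-0.5, weight=0.5):
    """Mean and variance of y given x under a two-component mixture.

    Assumes zero means and unit variances in both components, so each
    component contributes a normal with mean rho*x and variance 1 - rho^2.
    """
    m1, m2 = rho1 * x, rho2 * x
    v1, v2 = 1.0 - rho1 ** 2, 1.0 - rho2 ** 2
    mean = weight * m1 + (1.0 - weight) * m2
    second_moment = weight * (v1 + m1 ** 2) + (1.0 - weight) * (v2 + m2 ** 2)
    return mean, second_moment - mean ** 2


for x in (0.0, 1.0, 2.0):
    print(x, mixture_conditional_moments(x))
```

The conditional mean is zero for every x, so no linear trend exists, yet the conditional variance grows as .75 + .25 x^2. The spread of y therefore depends on x, which is exactly the kind of non-linear dependence a correlation coefficient cannot reveal.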
Abstract

In  some  industrial  applications  one  compares  a  sample  mean  and  minimum,  or  a  mean 
and  maximum,  to  reference  values,  and  determines  if  the  lot  from  which  the  sample  was 
taken  is  acceptable,  or  if  further  investigation  of  this  lot  is  indicated.  Because  the  exact  joint 
distribution  of  an  extremum  and  the  mean  of  a  normal  sample  is  complicated,  establishing 
these reference values using statistical considerations typically involves crude approximations
or simulation, even under the assumption of normality. The purpose of this article is
to use the saddlepoint method to develop a fairly simple and very accurate approximation
to the joint cdf of the mean and an extremum of a normal sample. Tables for use in establishing
acceptance criteria are also provided, and the use of these tables is illustrated with
an  example. 

KEY  WORDS:  saddlepoint,  control  chart,  acceptance  testing 

INTRODUCTION 

Let $Y_i$, $i = 1, \ldots, n$ be an iid sample from a normal population, and denote the sample order
statistics as $Y_{(1)} < Y_{(2)} < \ldots < Y_{(n)}$. In some industrial applications, one compares $Y_{(1)}$ and $\bar{Y}$,
or $Y_{(n)}$ and $\bar{Y}$, to reference values, to determine if the lot from which the sample was taken is
acceptable, or if this lot should be investigated further.

The results of this article are potentially applicable in diverse industries. For example,
emphasis  has  long  been  placed  on  checking  sample  means  and  minima  of  lots  of  various  packaged 
goods  (Brickenkamp  et  al.,  1988;  Croarkin  and  Yang,  1987).  Also,  the  sample  means  and  minima 
are  used  in  the  testing  of  batches  of  raw  material  by  many  manufacturers  of  composite  materials. 
And  means  and  maxima  of  power  loss  of  sampled  motors  have  been  proposed  for  use  in  testing 
whether  manufactured  motors  comply  with  labeled  motor  efficiencies  (Stricklett  and  Vangel, 
1997). 

If  these  reference  values  are  established  using  statistical  considerations,  then  it  is  likely  that 
either  crude  approximations  or  simulation  will  be  employed,  because  the  exact  joint  distribution 
of  an  extremum  and  the  mean  of  a  normal  sample  is  complicated,  with  complexity  increasing 
rapidly with n (Murty and Bissinger, 1982). The purpose of this article is to develop an
approximation for this joint cdf based on the saddlepoint method (see, e.g., Barndorff-Nielsen and
Cox, 1989, Section 4.3), which is very accurate and straightforward to compute. The use of this
distribution in acceptance testing is illustrated with an example.

Approved for public release; distribution is unlimited.

AN APPROXIMATE JOINT CDF FOR $Y_{(1)}$ AND $\bar{Y}$

Assume that the random variables $Y_i$ are iid normal with mean $\mu$ and variance $\sigma^2$. Since we
will assume that $\mu$ and $\sigma$ can be regarded as approximately known from previous data, we
can, without loss of generality, employ the standardized sample $X_i = (Y_i - \mu)/\sigma$, having order
statistics $X_{(i)} = (Y_{(i)} - \mu)/\sigma$.

Let $F_{X_{(1)}}(t)$ denote the cumulative distribution of $X_{(1)}$, and let $F_{\bar{X}}(t)$ be the cdf of $\bar{X}$; that is,

$$F_{X_{(1)}}(t) = \Pr(X_{(1)} \le t) = 1 - (1 - \Phi(t))^n$$

and

$$F_{\bar{X}}(t) = \Pr(\bar{X} \le t) = \Phi(\sqrt{n}\,t),$$

where $\Phi(\cdot)$ is the standard normal cdf.
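The two marginal distribution functions are simple enough to evaluate directly. The following is a minimal sketch using only the Python standard library; the function names `cdf_min` and `cdf_mean` are ours.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1


def cdf_min(t, n):
    """Pr(X_(1) <= t): cdf of the minimum of n iid standard normals."""
    return 1.0 - (1.0 - STD_NORMAL.cdf(t)) ** n


def cdf_mean(t, n):
    """Pr(Xbar <= t): cdf of the mean of n iid standard normals."""
    return STD_NORMAL.cdf(sqrt(n) * t)


print(cdf_min(0.0, 2))   # the minimum of two is below 0 unless both are above 0
print(cdf_mean(0.0, 5))  # the mean is symmetric about 0
```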

The primary objective of this paper is to develop and illustrate an approximation to

$$F_{X_{(1)},\bar{X}}(t_1, t_2) = \Pr(X_{(1)} \le t_1,\ \bar{X} \le t_2).$$

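Let $\phi(t)$ denote the normal density, and let

$$h(t) = \frac{\phi(t)}{1 - \Phi(t)}$$

be the normal hazard function. In the Appendix, we derive the saddlepoint approximation

$$F_{X_{(1)},\bar{X}}(t_1, t_2) \approx \frac{\Phi(\sqrt{n}\,t_2)\int_{-\infty}^{t^*} A(t)\,dt + \int_{t^*}^{\infty} \Phi\left\{\sqrt{n}\left[t_1 + \frac{n-1}{n}\,(h(t) - t)\right]\right\} A(t)\,dt}{\int_{-\infty}^{\infty} A(t)\,dt} \qquad (1)$$

for $t_1 < t_2$, where

$$A(t) = h^{-(n-1)}(t)\,\exp\left\{\frac{(n-1)^2}{2n}\,(h(t) - t)^2 + (n-1)\,t\,(h(t) - t)\right\}\sqrt{1 - h^2(t) + t\,h(t)},$$

and where $t^*$ is the (unique) solution to the equation

$$\frac{n-1}{n}\,\left(h(t^*) - t^*\right) = t_2 - t_1. \qquad (2)$$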

The  approximate  cumulative  distribution  (1)  satisfies  the  conditions  of  a  bivariate  cdf.  The 
derivation appears in an appendix because it is fairly technical; however, the details are
elementary, and somewhat interesting.

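A contour plot of $F_{X_{(1)},\bar{X}}(t_1, t_2)$ for sample size n = 5 is displayed in Figure 1. Note the
rather sharp 'corners' on the contours: if $t_2 \gg t_1$, then

$$F_{X_{(1)},\bar{X}}(t_1, t_2) \approx F_{X_{(1)},\bar{X}}(t_1, \infty) = F_{X_{(1)}}(t_1),$$

and if $t_1 \gg t_2$,

$$F_{X_{(1)},\bar{X}}(t_1, t_2) \approx F_{X_{(1)},\bar{X}}(\infty, t_2) = F_{\bar{X}}(t_2).$$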

When  choosing  reference  values  to  be  used  for  acceptance  testing,  one  would  typically  try  to 
make  essential  use  of  both  the  mean  and  the  extremum;  this  corresponds  to  being  near  the  corners 
of the contours in Figure 1. One reasonable condition to impose is that, if  = $\alpha$,

then

$$F_{X_{(1)}}(t_1) = F_{\bar{X}}(t_2) = \alpha/2. \qquad (3)$$

This ensures that if a future lot comes from the same population as the lots used to establish
the acceptance criteria, then

$$\Pr(Y_{(1)} \le \mu + t_1\sigma) = \Pr(\bar{Y} \le \mu + t_2\sigma).$$

The $(t_1, t_2)$ pairs for which this is true are given by the equation

$$F_{X_{(1)}}(t_1) = F_{\bar{X}}(t_2). \qquad (4)$$

For n = 5, (4) is displayed as the broken curve in Figure 1.
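Given a cutoff $t_1$ for the minimum, the matching cutoff $t_2$ for the mean on the curve (4) has a closed form, because the cdf of the mean can be inverted directly. The sketch below is one way to compute it with the Python standard library; the function name `mean_cutoff` is ours.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()


def mean_cutoff(t1, n):
    """Solve F_min(t1) = F_mean(t2) for t2, for a standardized sample of size n."""
    p = 1.0 - (1.0 - STD_NORMAL.cdf(t1)) ** n  # Pr(X_(1) <= t1)
    return STD_NORMAL.inv_cdf(p) / sqrt(n)     # t2 with Phi(sqrt(n) * t2) = p


print(mean_cutoff(-2.5, 3))
```

For n = 3 and $t_1 = -2.5$ this gives a value close to $-1.204$, in line with the pair used in the accuracy check of the next section.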

THE  ACCURACY  OF  THE  APPROXIMATION 

The saddlepoint approximation used in the derivation of (1) is asymptotic in n, and hence
can be expected to be accurate for large n. However, the saddlepoint method often leads
to approximations which are reasonably accurate for all sample sizes, and this turns out to
be the case for $F_{X_{(1)},\bar{X}}(t_1, t_2)$. As an example, if n = 3, $t_1 = -2.5$, and $t_2 = -1.20404$
(which is the value given by (4)), then from a 1,000,000-replicate simulation, we have that
$F_{X_{(1)},\bar{X}}(-2.5, -1.20404) \approx .005421$, with an approximate 95% (binomial) confidence interval of

$$.005274 \le F_{X_{(1)},\bar{X}}(-2.5, -1.20404) \le .005567.$$

For this example, the approximation (1) gives .005226, which is only slightly below the simulation
uncertainty interval.
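A simulation of the kind described above is easy to sketch. The version below is not the authors' program; it is a small illustration that uses 200,000 replicates rather than 1,000,000 to keep the run time short, and the seed and function name are arbitrary choices of ours.

```python
import random


def simulate_joint_cdf(t1, t2, n, replicates, seed=0):
    """Monte Carlo estimate of Pr(X_(1) <= t1 and Xbar <= t2) for n iid N(0, 1)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(replicates):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        if min(sample) <= t1 and sum(sample) / n <= t2:
            hits += 1
    return hits / replicates


estimate = simulate_joint_cdf(-2.5, -1.20404, n=3, replicates=200_000)
# Binomial standard error of the estimated proportion.
std_error = (estimate * (1.0 - estimate) / 200_000) ** 0.5
print(estimate, std_error)
```

With fewer replicates the interval is wider than the one quoted in the text, so this sketch is only suitable for a rough check of the order of magnitude.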

Rather than performing a large simulation, we have relied on checking special cases (such as
the above example), and on comparison with the exact result for n = 2. If n = 2, it is easy to
show that

$$F_{X_{(1)},\bar{X}}(t_1, t_2) = 2\int_{-\infty}^{t_1} \left[\Phi(2t_2 - x) - \Phi(x)\right]\phi(x)\,dx. \qquad (5)$$

Contours of (5) are displayed in Figure 2, along with the corresponding contours of (1). Note
that the approximation is quite accurate, even for this 'worst case' sample size.

TABLES 

We provide tables for the following situations.

1. One intends to reject a lot if either $Y_{(1)} \le \mu - k_1\sigma$ or $\bar{Y} \le \mu - k_2\sigma$; or else

2. one plans to reject if either $Y_{(n)} \ge \mu + k_1\sigma$ or $\bar{Y} \ge \mu + k_2\sigma$.

We impose the condition that if an error of the first kind is made, then it is equally likely that
we are required to reject because of a sample extremum as it is that we must fail because of the
sample mean. That is, calculations for the tables employ the constraint (4).

Tables 1 and 2 provide values $k_1$ and $k_2$ such that

$$Q(k_1, k_2) = F_{X_{(1)}}(-k_1) + F_{\bar{X}}(-k_2) - F_{X_{(1)},\bar{X}}(-k_1, -k_2) = \alpha, \qquad (6)$$

and

$$F_{X_{(1)}}(-k_1) = F_{\bar{X}}(-k_2) = \alpha/2,$$

where the probability ($\alpha$) and sample size (n) correspond, respectively, to rows and columns in
the tables. The tabulated function $Q(k_1, k_2)$ has the property that, within the accuracy of the
saddlepoint approximation,

$$Q(k_1, k_2) \approx \Pr(Y_{(n)} \ge \mu + k_1\sigma \ \text{or}\ \bar{Y} \ge \mu + k_2\sigma) = \Pr(Y_{(1)} \le \mu - k_1\sigma \ \text{or}\ \bar{Y} \le \mu - k_2\sigma).$$
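The equal-marginal constraint behind the tables can be spot-checked for a single entry. The sketch below takes the tabulated pair for n = 5 and $\alpha = .001$ ($k_1 = 3.7132$, $k_2 = 1.4687$) and evaluates the two marginal tail probabilities with the Python standard library; the function names are ours, and the joint term in (6) is not evaluated here.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()


def tail_prob_min(k1, n):
    """Pr(X_(1) <= -k1) for a standardized normal sample of size n."""
    return 1.0 - (1.0 - STD_NORMAL.cdf(-k1)) ** n


def tail_prob_mean(k2, n):
    """Pr(Xbar <= -k2) for a standardized normal sample of size n."""
    return STD_NORMAL.cdf(-sqrt(n) * k2)


n = 5
k1, k2 = 3.7132, 1.4687  # Table 1 and Table 2 entries for alpha = .001, n = 5
p_min = tail_prob_min(k1, n)
p_mean = tail_prob_mean(k2, n)
print(p_min, p_mean)
```

The two probabilities agree with each other to within the rounding of the tabulated constants, and each is close to $\alpha/2$.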
POWER


To illustrate the power of an acceptance test based on the mean and an extremum, we let n = 5,
$\alpha = .001$, and we require that (3) holds under the null hypothesis. Denote the parameters of the
normal distribution under the null hypothesis by $(\mu_0, \sigma_0)$, and under the alternative by $(\mu, \sigma)$.

Using the function Q defined in (6), it is easy to show that the power of this test is

$$\pi_1(a, b) = \Pr\left[Y_{(1)} \le \mu_0 - k_1\sigma_0 \ \text{or}\ \bar{Y} \le \mu_0 - k_2\sigma_0\right] = Q\left[b(a + k_1),\ b(a + k_2)\right], \qquad (7)$$

where $a = (\mu - \mu_0)/\sigma_0$, $b = \sigma_0/\sigma$, $k_1 = 3.7132$, and $k_2 = 1.4687$. We compare this function with
the power of a test based on the mean alone,

$$\pi_2(a, b) = \Pr(\bar{Y} \le \mu_0 - z_\alpha\sigma_0/\sqrt{n}) = \Phi\left[-b(a\sqrt{n} + z_\alpha)\right], \qquad (8)$$

where $z_\alpha = 3.0902$ is a normal quantile.

Contours of the power differences

$$\delta(a, b) = \pi_1(a, b) - \pi_2(a, b)$$

are displayed in Figure 3. The only situation where the test based on the mean has appreciably
more power than that which uses both the mean and an extremum is when $a \approx -k_2$ and $b > 1$. One
can see from (7) that when $a = -k_2$, then $\pi_1(a, b)$ depends on the data only through $Y_{(1)}$. On
the other hand, when b is substantially less than 1, $\delta(a, b)$ can be large. So the use of the mean
and an extremum is preferable to the use of the mean alone if, in addition to shifts in the mean,
one is concerned about detecting increases in the standard deviation. (The use of an extremum
also provides obvious protection against spurious extreme observations, but this is more a matter
of robustness than power.)
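The power of the mean-only test in (8) is a one-line formula and can be explored numerically. The sketch below implements only that formula; evaluating the power of the two-statistic test would additionally require the joint cdf approximation (1), which is not attempted here. The function name and the example shifts are our own choices.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()


def power_mean_only(a, b, n=5, z_alpha=3.0902):
    """Power of the test that rejects when the sample mean is too low.

    a is the mean shift in units of the null standard deviation and
    b is the ratio of the null to the alternative standard deviation.
    """
    return STD_NORMAL.cdf(-b * (a * sqrt(n) + z_alpha))


print(power_mean_only(0.0, 1.0))   # no shift: the nominal level, about .001
print(power_mean_only(-1.0, 1.0))  # mean one null sd lower
print(power_mean_only(0.0, 0.5))   # standard deviation doubled, mean unchanged
```

With no mean shift, doubling the standard deviation raises the rejection probability of the mean-only test only to a few percent, which illustrates why adding an extremum helps when increases in spread are a concern.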


An  Example 

As  an  example  of  how  the  results  of  this  paper  can  be  used  in  practice,  we  consider  the  problem 
of  testing  incoming  batches  of  raw  material.  Aerospace  composite  materials  are  often  fabricated 
from  large  rolls  of  resin-impregnated  graphite  fiber.  In  order  to  determine  whether  a  newly- 
arrived  roll  is  a  cause  for  concern,  specimens  are  made,  and  various  mechanical  and  chemical 
properties  are  tested.  For  strength  properties,  it  has  become  common  in  this  industry  to  require 
that  both  the  average  and  minimum  of  the  test  data  exceed  certain  critical  values.  Usually  these 
threshold  values  are  arrived  at  in  a  somewhat  ad-hoc  manner.  We  suggest  here  an  alternative. 

Figure 4 displays actual compressive strength data for n = 5 specimens from each of the 23
initial batches of raw material purchased for an aircraft application. The first four lots of raw
material were used in initial qualification. The mean and standard deviation of these twenty
values are $\bar{x} = 143.95$ and $s = 8.29$, respectively. We assume that an acceptable probability of a
Type 1 error is $\alpha = .001$, and obtain the values $k_1 = 3.7132$ and $k_2 = 1.4687$ from Tables 1 and
2, respectively. The acceptance limits are then calculated as

$$A_1 = \bar{x} - k_1 s = 113.2$$

and

$$A_2 = \bar{x} - k_2 s = 131.8.$$

Figure 4 shows the data for all 23 lots, plotted against time. It can be seen that none of the
means and minima of data from lots 5-23 fall below their respective acceptance limits; hence
there is no indication provided to examine any of these lots further.
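The acceptance limits of this example follow directly from the qualification statistics and the tabulated constants. The sketch below reproduces the arithmetic and applies the limits to two lots; the specimen values in `good_lot` and `suspect_lot` are invented for illustration and are not the data plotted in Figure 4.

```python
def acceptance_limits(mean, sd, k1, k2):
    """Lower limits for the lot minimum (A1) and the lot mean (A2)."""
    return mean - k1 * sd, mean - k2 * sd


def lot_is_acceptable(values, a1, a2):
    """A lot passes when its minimum is at least A1 and its mean is at least A2."""
    return min(values) >= a1 and sum(values) / len(values) >= a2


a1, a2 = acceptance_limits(mean=143.95, sd=8.29, k1=3.7132, k2=1.4687)
print(round(a1, 1), round(a2, 1))

# Hypothetical lots of n = 5 specimens each.
good_lot = [140.1, 150.2, 138.7, 145.0, 147.3]
suspect_lot = [112.0, 150.0, 148.0, 146.0, 149.0]
print(lot_is_acceptable(good_lot, a1, a2))
print(lot_is_acceptable(suspect_lot, a1, a2))
```

The second hypothetical lot has an acceptable mean but one specimen below $A_1$, so it would be flagged by the minimum alone.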


CONCLUSIONS 
A saddlepoint approximation to the joint distribution of the mean and an extremum leads to
a bivariate lot-acceptance chart which should prove useful in industrial applications where one
would  like  protection  against  very  low  or  high  individual  values,  as  well  as  against  shifts  in  the 
mean.  The  approximation  has  been  shown  to  be  accurate  for  all  sample  sizes,  and  with  the 
provided  tables  it  is  easy  to  use.  A  power  analysis  suggests  that  tests  based  on  the  mean  and 
an  extremum  are  typically  at  least  comparable  in  power  to  tests  based  on  the  mean  alone,  and 
they  can  be  substantially  more  powerful  in  detecting  increases  in  standard  deviation,  and  robust 
against  spurious  extreme  values. 

APPENDIX:  DETAILS  OF  THE  SADDLEPOINT  APPROXIMATION 


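The approach which we will use begins with the observation that

$$\Pr(\bar{X} \le t_2 \mid X_{(1)} = t_1) = \Pr\left(\frac{\sum_{i=1}^{n-1} U_i + t_1}{n} \le t_2\right),$$

where the $U_i$ are iid truncated normal random variables, with support on the interval
$[t_1, \infty)$. Let $U = \sum_{i=1}^{n-1} U_i$ denote this sum. The cumulant generating function of $U_i$, and its
first two derivatives, are easily shown to be

$$K(t) = \log[1 - \Phi(t_1 - t)] - \log[1 - \Phi(t_1)] + t^2/2,$$

$$K'(t) = h(t_1 - t) + t, \ \text{and}$$

$$K''(t) = (t_1 - t)\,h(t_1 - t) - h^2(t_1 - t) + 1.$$

The saddlepoint approximation to the density of U is

$$f_U(s) \approx \frac{e^{-\lambda s + (n-1)K(\lambda)}}{\sqrt{2\pi(n-1)K''(\lambda)}}, \qquad (9)$$

where $\lambda$ is the (unique) root to the equation

$$K'(\lambda) = s/(n-1). \qquad (10)$$

The approximate equality symbol ($\approx$) used above will appear in several places in this appendix.
We take this opportunity to point out that the only approximation made in these derivations is
the replacement of the true distribution of U with its saddlepoint approximation, as in (9).

The joint density of $\bar{X}$ and $X_{(1)}$ can be expressed as

$$f_{X_{(1)},\bar{X}}(t_1, t_2) = \left[\frac{\partial}{\partial t_2}\Pr(\bar{X} \le t_2 \mid X_{(1)} = t_1)\right] f_{X_{(1)}}(t_1) = n f_U(nt_2 - t_1)\left[n\phi(t_1)(1 - \Phi(t_1))^{n-1}\right].$$

Substituting in (9), we have that

$$f_U(nt_2 - t_1) \approx \frac{e^{-\lambda(nt_2 - t_1)}\,e^{(n-1)[\lambda^2 - (t_1 - \lambda)^2]/2}}{\sqrt{2\pi(n-1)K''(\lambda)}\,(2\pi)^{(n-1)/2}\,h^{n-1}(t_1 - \lambda)\,[1 - \Phi(t_1)]^{n-1}},$$

so

$$f_{X_{(1)},\bar{X}}(t_1, t_2) \approx \frac{n^2\,e^{-n t_1^2/2 - n\lambda(t_2 - t_1)}}{(2\pi)^{(n+1)/2}\sqrt{(n-1)K''(\lambda)}\,h^{n-1}(t_1 - \lambda)}
= \frac{n^2 \exp\left\{-\frac{n}{2}\left[t_1^2 - 2(t_1 - \lambda)(t_2 - t_1) + 2t_1(t_2 - t_1)\right]\right\}}{(2\pi)^{(n+1)/2}\sqrt{(n-1)K''(\lambda)}\,h^{n-1}(t_1 - \lambda)}.$$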
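By subtracting $t_1$ from both sides of (10), one can see that the saddlepoint $\lambda$ is the root of
the equation

$$h(t_1 - \lambda) - (t_1 - \lambda) = \frac{n}{n-1}\,(t_2 - t_1). \qquad (11)$$

Thus, $t_1 - \lambda$ is a function of $t_2 - t_1$, which we denote $q(t_2 - t_1)$. Since $K''(\lambda)$ is a function of
$t_1 - \lambda$ alone, it can be regarded as a function of $t_2 - t_1$, say $L(t_2 - t_1)$. For the same reason,
$h(t_1 - \lambda)$ is also a function of $t_2 - t_1$, which we will denote $\tilde{h}(t_2 - t_1)$.

This motivates the change of variables $Z = X_{(1)}$ and $W = \bar{X} - X_{(1)}$, which has unit Jacobian
and which leads to

$$f_{W,Z}(w, z) \approx \frac{n^2 \exp\left[-\frac{n}{2}\left(z^2 + 2zw - 2wq(w)\right)\right]}{(2\pi)^{(n+1)/2}\sqrt{(n-1)L(w)}\,\tilde{h}^{n-1}(w)}
= \frac{n^2\,e^{-n(z+w)^2/2}\,e^{n(w^2 + 2wq(w))/2}}{(2\pi)^{(n+1)/2}\sqrt{(n-1)L(w)}\,\tilde{h}^{n-1}(w)}. \qquad (12)$$

The approximate density (12) need not integrate to one. We thus modify it by dividing by
the normalizing constant

$$C_n = \int_0^\infty dw \int_{-\infty}^{\infty} dz\, f_{W,Z}(w, z) = \int_0^\infty \frac{n^{3/2}\,e^{n(w^2 + 2wq(w))/2}}{(2\pi)^{n/2}\sqrt{L(w)(n-1)}\,\tilde{h}^{n-1}(w)}\,dw. \qquad (13)$$

By definition, $t = q(w)$ satisfies $h(t) - t = wn/(n-1)$, and $h(t) - t$ is monotone. This suggests the change of
variable

$$w = \frac{n-1}{n}\left[h(t) - t\right], \qquad (14)$$

with Jacobian

$$dw = \frac{n-1}{n}\left|h^2(t) - t\,h(t) - 1\right| dt. \qquad (15)$$

Now we can write (13) as an integral in t, eliminating the need to determine $\lambda$:

$$C_n = \frac{\sqrt{n(n-1)}}{(2\pi)^{n/2}} \int_{-\infty}^{\infty} h^{-(n-1)}(t)\exp\left\{\frac{(n-1)^2}{2n}(h(t) - t)^2 + (n-1)\,t\,(h(t) - t)\right\}\sqrt{1 - h^2(t) + t\,h(t)}\;dt. \qquad (16)$$

Since

$$\Pr(X_{(1)} \le t_1 \ \text{and}\ \bar{X} \le t_2) = \Pr(Z \le t_1 \ \text{and}\ W \le t_2 - t_1) + \Pr(Z + W \le t_2 \ \text{and}\ W > t_2 - t_1),$$

we can express the approximate cumulative distribution of $X_{(1)}$ and $\bar{X}$ as the following sum of
two integrals in the (W, Z) plane:

$$p = \Pr(X_{(1)} \le t_1 \ \text{and}\ \bar{X} \le t_2) \approx \left[\int_0^{t_2 - t_1} dw \int_{-\infty}^{t_1} dz + \int_{t_2 - t_1}^{\infty} dw \int_{-\infty}^{t_2 - w} dz\right] \frac{1}{C_n}\,\frac{n^2\,e^{-n(z+w)^2/2}\,e^{n(w^2 + 2wq(w))/2}}{(2\pi)^{(n+1)/2}\sqrt{(n-1)L(w)}\,\tilde{h}^{n-1}(w)}, \qquad (17)$$

where an obvious notation has been used to indicate the region of integration. As was the case
in the derivation of $C_n$, integration in z can be performed explicitly, after which (17) becomes

$$p \approx \frac{n^{3/2}}{C_n (2\pi)^{n/2}} \left\{ \int_0^{t_2 - t_1} \frac{\Phi\left(\sqrt{n}\,(t_1 + w)\right) e^{n[w^2 + 2wq(w)]/2}}{\sqrt{(n-1)L(w)}\,\tilde{h}^{n-1}(w)}\,dw + \Phi(\sqrt{n}\,t_2) \int_{t_2 - t_1}^{\infty} \frac{e^{n[w^2 + 2wq(w)]/2}}{\sqrt{(n-1)L(w)}\,\tilde{h}^{n-1}(w)}\,dw \right\}. \qquad (18)$$

Finally, we make the same change of variable (14,15) as in (16) to derive (1).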




References 


[1] Barndorff-Nielsen, O.E. and Cox, D.R. Asymptotic Techniques for Use in Statistics, New
York: Chapman and Hall, 1989.

[2] Bissinger, B.H. and Murty, V.N. "The Joint Probability Density Function of the Sample
Mean and the Sample Maximum," Gujarat Statist. Rev., 8, 7-14, 1981.

[3] Brickenkamp, C.E., Hasko, S., Natrella, M.G. Checking the Net Contents of Packaged Goods,
NBS Handbook 133, Third Edition, Gaithersburg, MD: National Bureau of Standards,
1988.

[4] Croarkin, M.C. and Yang, G.L. "Acceptance Probabilities for a Sampling Procedure Based
on the Mean and an Order Statistic," J. Res. Nat. Bur. Standards, 87, 485-511, 1987.

[5] Stricklett, K.L. and Vangel, M.G. "Electric Motor Efficiency Testing Under the New Part
431 of Chapter II of Title 10, Code of Federal Regulations," NIST Technical Note TN 1422.
Gaithersburg, MD: National Institute of Standards and Technology, 1997.




Table 1: Constants for an Acceptance Test Using the Mean and an Extremum (Values for
Extremum; $k_1$)

                                        Sample Size (n)
$\alpha$         2       3       4       5       6       7       8       9       10
.5          0.7166  1.0254  1.2142  1.3498  1.4548  1.54    1.6113  1.6724  1.7258
.25         1.2887  1.5407  1.6972  1.8106  1.899   1.9711  2.0317  2.0838  2.1295
.10         1.8167  2.0249  2.1561  2.252   2.3272  2.3887  2.4407  2.4856  2.525
.05         2.1385  2.3239  2.442   2.5286  2.5967  2.6527  2.7     2.7411  2.7772
.025        2.4208  2.5888  2.6965  2.7758  2.8384  2.89    2.9337  2.9717  3.0052
.01         2.7526  2.9027  2.9997  3.0715  3.1283  3.1753  3.2153  3.25    3.2807
.005        2.9805  3.1198  3.2103  3.2775  3.3309  3.3751  3.4127  3.4455  3.4745
.0025       3.193   3.3232  3.4082  3.4716  3.522   3.5638  3.5995  3.6307  3.6582
.001        3.4549  3.5751  3.6541  3.7132  3.7603  3.7995  3.8331  3.8623  3.8883
.0005       3.6412  3.755   3.8301  3.8864  3.9314  3.969   4.0011  4.0292  4.0541
.00025      3.8188  3.927   3.9987  4.0526  4.0958  4.1319  4.1628  4.1898  4.2138
.0001       4.0421  4.1439  4.2117  4.2629  4.304   4.3384  4.3678  4.3936  4.4166
.00005      4.2035  4.3011  4.3664  4.4157  4.4554  4.4886  4.5172  4.5422  4.5644
.000025     4.3592  4.453   4.516   4.5637  4.6022  4.6344  4.662   4.6863  4.7079
.00001      4.5573  4.6466  4.7069  4.7527  4.7897  4.8206  4.8473  4.8707  4.8915

Table 2: Constants for an Acceptance Test Using the Mean and an Extremum (Values for Mean;
$k_2$)

                                        Sample Size (n)
$\alpha$         2       3       4       5       6       7       8       9       10
.5          0.1472  0.1591  0.1539  0.1473  0.141   0.1354  0.1303  0.1258  0.1217
.25         0.6266  0.5421  0.4818  0.4382  0.4048  0.3782  0.3563  0.3379  0.3221
.10         1.0539  0.8836  0.7744  0.6978  0.6403  0.5951  0.5583  0.5276  0.5016
.05         1.3076  1.0868  0.9486  0.8525  0.7808  0.7246  0.679   0.6411  0.6089
.025        1.5266  1.2626  1.0995  0.9866  0.9026  0.8369  0.7838  0.7396  0.7022
.01         1.7804  1.4666  1.2747  1.1425  1.0443  0.9678  0.9059  0.8545  0.811
.005        1.9528  1.6054  1.3941  1.2488  1.1411  1.0571  0.9893  0.933   0.8854
.0025       2.1123  1.7341  1.5049  1.3475  1.2309  1.1401  1.0668  1.0061  0.9546
.001        2.3076  1.8919  1.6408  1.4687  1.3413  1.2422  1.1622  1.0959  1.0397
.0005       2.4457  2.0035  1.7371  1.5546  1.4196  1.3145  1.2298  1.1596  1.1002
.00025      2.5768  2.1097  1.8287  1.6363  1.4941  1.3835  1.2943  1.2203  1.1578
.0001       2.7411  2.2429  1.9436  1.739   1.5877  1.4701  1.3752  1.2966  1.2301
.00005      2.8595  2.3389  2.0266  1.813   1.6553  1.5326  1.4337  1.3517  1.2824
.000025     2.9734  2.4313  2.1065  1.8844  1.7204  1.5928  1.49    1.4048  1.3327
.00001      3.1179  2.5487  2.2079  1.9751  1.8031  1.6694  1.5616  1.4723  1.3968
Figure 3 (axis label): Ratio of S.D.s: Null/Alternative

Figure 4: Example lot acceptance chart for compressive strength of a composite material (n =
5, $\alpha$ = .001). The first four lots were used to set up the limits. (Axis: Strength, 120 to 160.)
APPROXIMATE QUANTILES FOR THE MULTIVARIATE STUDENTIZED RANGE
IN THE CASE OF THREE UNEQUAL GROUPS

Otto Schwalb and James R. Thompson

Department  of  Statistics 
Rice  University 
6100  Main  Street  MS-138 
Houston,  TX  77005 


ABSTRACT 

In this paper, we develop equations which provide approximate quantiles for the multivariate
studentized range in the case of three unequal groups. These equations are ready to use and are available via
a  website.  In  the  equal  groups  case,  our  approximations  are  more  accurate  than  existing  approximations 
by  a  factor  of  ten.  In  the  unequal  groups  case,  there  are  no  direct  competitors  to  our  approximations. 

INTRODUCTION 

The theory of multiple comparisons has a rich history in statistics. There is usually no debate about
which multiple comparisons procedure is best for a given situation. In the multivariate case, for example,
it is known that the $T^2_{\max}$ procedure (the simplest case of which is originally due to Roy (1953)
and generalizations of which are attributed to Krishnaiah (1969)) will always outperform the equivariant
procedures (among which are the procedures based upon Roy's root, the Pillai-Bartlett trace, the
Hotelling-Lawley trace, and Wilks' Lambda) in cases where there are a fixed number of contrasts specified
in advance and either confidence intervals or confidence regions (for particular parametric families)
are desired. This claim is strong indeed and is perhaps not widely known, but it has been established for
quite some time now. It requires more effort to demonstrate than just citing a few references here in the
introduction (and some of the results required are fairly obscure), and so will be discussed below. That
the $T^2_{\max}$ procedure possesses such outstanding properties is completely analogous to the relationship
between Scheffe's procedure and the studentized range procedure in ANOVA. The reason that the $T^2_{\max}$
procedure has not received much attention seems to be due to the problems associated with obtaining
its distribution in closed form. Siotani (1992) is one of the major investigators of the $T^2_{\max}$ procedure.
His breakthrough in providing an equation for approximate quantiles for the general case of equal groups
(after the closed form solution had remained an unsolved problem for nearly 40 years) is remarkable
and should hopefully serve to bring renewed attention to a procedure which is so useful and easy to
understand.

In this paper, we develop equations which provide approximate quantiles for the multivariate
studentized range in the case of one-way MANOVA with three unequal groups. The equations apply to the
following situations:

•  quantiles: 0.8, 0.825, 0.85, ..., 0.925, 0.95, 0.975, 0.99 (e.g. for a significance level of $\alpha = 0.1$, we
desire the $1 - \alpha = 0.9$ quantile)

•  dimensions: 1, 2, 3, 4, 5

•  group sizes: arbitrary group sizes between 4 and 100 (subject to the restriction that the largest
group be no more than 4 times the size of the smallest group).
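The dimension and group-size restrictions in the list above amount to a simple eligibility check that a user could run before looking up a quantile. The sketch below is only an illustration of those two restrictions; the function name is ours, it does not compute any quantile, and it does not check the quantile level.

```python
def within_supported_range(group_sizes, dimension):
    """Check the stated dimension and group-size restrictions for three groups."""
    if len(group_sizes) != 3:
        return False
    if dimension not in (1, 2, 3, 4, 5):
        return False
    if any(size < 4 or size > 100 for size in group_sizes):
        return False
    # The largest group may be at most 4 times the size of the smallest.
    return max(group_sizes) <= 4 * min(group_sizes)


print(within_supported_range((10, 20, 40), dimension=3))  # 40 is exactly 4 x 10
print(within_supported_range((5, 10, 30), dimension=2))   # 30 exceeds 4 x 5
```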
allows the interested reader easy access to these quantiles. The importance of these approximate quantiles
is that, until now, no quantiles have been available for the $T^2_{\max}$ procedure in the case of unequal groups. Our
approximations also apply to the case of equal groups, and, for the equal groups cases we consider, the
accuracy of our approximations improves upon the accuracy reported by others by a factor of ten.

Section 2 treats the theory associated with the $T^2_{\max}$ procedure in the general case, as well as the
special cases of two and three groups in 1-way MANOVA. Section 2 also demonstrates the claims made
above concerning the performance of the $T^2_{\max}$ procedure with respect to the equivariant procedures.
In section 3, the practical utility of the $T^2_{\max}$ procedure is demonstrated with an example using the
Fisher-Anderson Iris data (1939). The example illustrates how the $T^2_{\max}$ procedure is a strongly indicated choice
in many realistic situations. Section 4 discusses the development of our equations. Most of the details are
omitted from this section for the sake of brevity, but may be obtained from the first author. The details
are straightforward, however.


THEORY 


OVERVIEW 


The derivation of the $T^2_{\max}$ procedure to follow is rather general, that is to say, very little of the theory
to  follow  is  specific  to  the  1-way  case  or  the  case  of  all  pairwise  comparisons.  Parts  of  what  follow  can  be 
found  scattered  throughout  the  literature,  but  the  results  are  certainly  not  easy  to  find.  For  example,  the 
results  we  synthesize  on  general  confidence  regions  are  very  obscure.  They  are  certainly  beyond  the  reach 
of  anyone  who  simply  needs  to  use  the  procedures  in  an  applied  situation,  but  might  have  very  little 
time  to  search  the  literature.  Hopefully,  the  presentation  here  will  make  these  results  more  accessible  to 
a  wider  audience. 

The MANOVA setup starts with a matrix $Y$ which is $n \times p$ and whose rows are each distributed independently as $p$-variate normal vectors with common unknown covariance matrix $\Sigma$. It is further assumed that $E(Y) = X\beta$ where $X$ is a design matrix of order $n \times l$ and $\beta$ is the $l \times p$ matrix of unknown parameters. We assume that $X$ has full column rank. We first define Roy's root and the related multivariate test statistics. We wish to test the hypothesis $H_0 : C\beta = 0$ where $C$ is $r \times l$ and has full row rank. Define the hypothesis and error matrices (respectively) as

$$H = (C\hat\beta)^T W^{-1} (C\hat\beta) \quad \text{and} \quad E = Y^T \left(I - X(X^TX)^{-1}X^T\right) Y$$

where $\hat\beta = (X^TX)^{-1}X^TY$ and $W = C(X^TX)^{-1}C^T$. Let the eigenvalues of $HE^{-1}$ be denoted by $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_p$. Then Roy's root is $\lambda_1$, the Hotelling-Lawley trace is $\sum_{i=1}^{p} \lambda_i$, the Pillai-Bartlett trace is $\sum_{i=1}^{p} \lambda_i/(1+\lambda_i)$, and Wilks' lambda is $\prod_{i=1}^{p} 1/(1+\lambda_i)$. These statistics can be defined in similar ways using the eigenvalues of $H(E+H)^{-1}$, which tends to make the notation in the literature sometimes confusing. In fact, Roy's root seems to be defined in different ways depending on the author. We follow Wijsman (1979) and Kres (1983) in defining Roy's root as $\lambda_1$. For convenience, we will subsequently refer to these four statistics as the equivariant statistics.
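As a small illustration of the definitions above, the following Python sketch computes the four equivariant statistics from a list of eigenvalues of $HE^{-1}$. The function name `equivariant_statistics` and the example eigenvalues are hypothetical and chosen only for illustration; the sketch assumes the eigenvalues have already been obtained by some other means.

```python
from math import prod


def equivariant_statistics(eigenvalues):
    """Return the four classical MANOVA statistics from the eigenvalues of H E^{-1}.

    The eigenvalues are assumed to be non-negative and already computed.
    """
    lam = sorted(eigenvalues, reverse=True)  # lambda_1 >= lambda_2 >= ... >= lambda_p
    return {
        "roy_root": lam[0],                                      # largest eigenvalue
        "hotelling_lawley": sum(lam),                            # sum of lambda_i
        "pillai_bartlett": sum(v / (1.0 + v) for v in lam),      # sum of lambda_i / (1 + lambda_i)
        "wilks_lambda": prod(1.0 / (1.0 + v) for v in lam),      # product of 1 / (1 + lambda_i)
    }


# Hypothetical eigenvalues, for illustration only.
stats = equivariant_statistics([1.0, 3.0])
```

Under the alternative convention based on the eigenvalues of $H(E+H)^{-1}$, the same four quantities would be written differently, which is the source of the notational confusion mentioned above.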

For the $T^2_{\max}$ procedure, however, the hypothesis is stated in a different form as $H_0 : L\beta = 0$, where $L$ is not required to have full row rank. The matrix $L\beta$ is an explicit representation of each and every contrast of interest for the problem at hand. Since anything of the form $a^TC\beta$, $\forall a \in \Re^r$, is protected by procedures based on the equivariant statistics, the family of linear combinations considered by those procedures is vast in comparison with the family of linear combinations considered by the $T^2_{\max}$ procedure. It should be apparent that the null hypothesis considered by the $T^2_{\max}$ procedure and the null hypothesis for the classical formulation are not equivalent. This might seem confusing; however, the problem of multiple comparisons deals with obtaining the best (usually in the sense of the smallest) set of confidence regions for a given parametric family. Within the classical framework, then, we start with the hypothesis $H_0 : C\beta = 0$. Once a parametric family is specified (e.g. $a^TC\beta$, $\forall a \in \Re^r$), the discussion of competing multiple comparison procedures follows. Such a discussion cannot be carried out prior to specifying the parametric family of interest (see Wijsman (1979) for instance). With these subtleties in mind, the $T^2_{\max}$ procedure tests the null hypothesis $H_0 : L\beta = 0$ vs $H_1 : L\beta \ne 0$ where $L$ is a $t \times l$ matrix of contrasts. For example, in the case of 1-way MANOVA with 3 groups, $l = 3$ (the number of groups) and $t = 3$ (the number of pairwise comparisons). Applying a variation on the union-intersection approach (originally due to Roy (1953)), one sees that:

$$\{H_0 : L\beta = 0\} = \bigcap_{i=1}^{t} \{L_{[i,]}\beta = 0\}$$

where $L_{[i,]}$ represents the $i$th row of the contrast matrix $L$ (e.g. in the case of 1-way MANOVA with 3 groups under the "cell means" formulation of the design matrix $X$, $L_{[1,]} = (1, -1, 0)$ compares group 1 and group 2). Define $\delta_i := L_{[i,]}\beta$. As usual, the least squares estimators $\hat\beta$ are a linear combination of $Y$. By the Gauss-Markov Theorem, the minimum variance unbiased estimators for the $\delta_i$'s are $\hat\delta_i = L_{[i,]}\hat\beta$, which are also linear combinations of $Y$. It is easily seen that

$$\hat\delta_i \sim N_p(\delta_i, w_{ii}\Sigma)$$

where $w_{ii}$ is the $i$th diagonal element of $L(X^TX)^{-1}L^T$. Let $\hat\delta_i^* = \hat\delta_i/\sqrt{w_{ii}}$ and $\delta_i^* = \delta_i/\sqrt{w_{ii}}$. Then

$$\hat\delta_i^* \sim N_p(\delta_i^*, \Sigma)$$

so that

$$T_i^2 = (n-l)\,(\hat\delta_i^* - \delta_i^*)^T E^{-1} (\hat\delta_i^* - \delta_i^*)$$

is distributed as Hotelling's $T^2$, where $E$ is the error matrix (defined above). The $T^2_{\max}$ statistic is then defined as

$$T^2_{\max} = \max_{1 \le i \le t} T_i^2$$

The $T^2_{\max}$ procedure rejects $H_0$ if any of the $t$ Hotelling $T_i^2$ exceed $T^2_{\max}(\alpha)$, where $T^2_{\max}(\alpha)$ denotes the $1-\alpha$ quantile of the distribution of $T^2_{\max}$ (i.e. the value such that $P(T^2_{\max} \le T^2_{\max}(\alpha)) = 1-\alpha$). As will be discussed immediately, each $T_i^2$ which exceeds $T^2_{\max}(\alpha)$ is flagged as a significant contrast. It should be noted here that we will speak of the comparison of $T_i^2$ with $T^2_{\max}(\alpha)$, but our equations actually give values for $T_{\max}(\alpha)$ due to the conventions which are already established for the univariate studentized range, $q(\alpha)$, and the equal-groups multivariate studentized range, $R_{\max}(\alpha)$. Thus, in practice, one would need to take square roots of the $T_i^2$'s before comparing them to $T_{\max}(\alpha)$.
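The computation of the pairwise $T_i^2$ statistics and of $T^2_{\max}$ for a 1-way layout can be sketched in Python as follows. This is a minimal illustration, not the authors' code: it assumes the "cell means" design, evaluates each statistic under $H_0$ (so $\delta_i^* = 0$), and uses a small hand-written Gaussian elimination routine only to keep the example self-contained. The names `solve` and `pairwise_t2` and the example data are hypothetical.

```python
from itertools import combinations


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented copy of the system
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def pairwise_t2(groups):
    """Return ({(i, j): T^2 for groups i and j}, T^2_max) for all pairwise contrasts.

    groups is a list of groups, each a list of p-dimensional observations.
    """
    p = len(groups[0][0])
    n = sum(len(g) for g in groups)
    l = len(groups)  # number of groups (columns of the cell-means design)
    means = [[sum(obs[k] for obs in g) / len(g) for k in range(p)] for g in groups]
    # Error matrix E: pooled within-group sums of squares and cross products.
    E = [[0.0] * p for _ in range(p)]
    for g, mu in zip(groups, means):
        for obs in g:
            d = [obs[k] - mu[k] for k in range(p)]
            for r in range(p):
                for c in range(p):
                    E[r][c] += d[r] * d[c]
    t2 = {}
    for i, j in combinations(range(l), 2):
        w = 1.0 / len(groups[i]) + 1.0 / len(groups[j])    # w_ii for this contrast
        d = [means[i][k] - means[j][k] for k in range(p)]  # estimated contrast
        x = solve(E, d)                                    # E^{-1} times the contrast
        t2[(i, j)] = (n - l) * sum(dk * xk for dk, xk in zip(d, x)) / w
    return t2, max(t2.values())


# Hypothetical data: 3 groups, p = 2 variables.
g1 = [(1.0, 2.0), (2.0, 1.0), (3.0, 4.0), (4.0, 3.0)]
g2 = [(2.0, 5.0), (3.0, 3.0), (5.0, 6.0), (6.0, 4.0)]
g3 = [(6.0, 1.0), (7.0, 3.0), (8.0, 2.0), (9.0, 5.0)]
t2_pairs, t2_max = pairwise_t2([g1, g2, g3])
```

Each entry of `t2_pairs` would then be compared with the quantile $T^2_{\max}(\alpha)$ (or its square root with $T_{\max}(\alpha)$); obtaining that quantile is a separate problem.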

The $T^2_{\max}$ procedure has the intuitive appeal that it rejects iff $T_i^2 > T^2_{\max}(\alpha)$ for some $i$, $i = 1, \ldots, t$, which occurs iff at least one of the $p$-dimensional confidence ellipsoids does not contain the 0-vector. That this follows is a simple consequence of the fact that $T^2_{\max}(\alpha)$ is chosen to satisfy

$$P\left(\max_{1 \le i \le t} T_i^2 \le T^2_{\max}(\alpha)\right) = 1 - \alpha$$

which is the same as writing

$$P\left(T_i^2 \le T^2_{\max}(\alpha),\ i = 1, \ldots, t\right) = 1 - \alpha$$

Beyond simply rejecting, however, each $T_i^2$ which exceeds $T^2_{\max}(\alpha)$ will be flagged as significant. For example, if $t = 3$ and we are given data and find that $T_1^2$ and $T_2^2$ exceed $T^2_{\max}(\alpha)$ but $T_3^2$ does not, then we would flag $\delta_1^*$ and $\delta_2^*$ as the culprits. The practical interpretation of this is subject to the limitations of any confidence statement as follows. If repeated samples are taken and the 3 ellipsoids are constructed each time, then $(1-\alpha) \times 100\%$ of the time all 3 of these ellipsoids will contain the true $\delta_i^*$, $i = 1,2,3$. Equivalently, $(1-\alpha) \times 100\%$ of the time, all 3 of the $T_i^2$'s will reflect the location of the true $\delta_i^*$, $i = 1,2,3$, as indicated by their relationship to the $T^2_{\max}(\alpha)$ quantile. For example, if $\delta_1$ and $\delta_2$ were truly nonzero, but $\delta_3$ was zero, then, under repeated sampling, $(1-\alpha) \times 100\%$ of the time we would find that $T_1^2 > T^2_{\max}(\alpha)$, $T_2^2 > T^2_{\max}(\alpha)$, and $T_3^2 \le T^2_{\max}(\alpha)$.

In addition to the $T^2_{\max}$ procedure described above, one can also construct confidence intervals for all pairwise contrasts. Specifically, it can be shown (see Krishnaiah (1969, p. 126)) that confidence intervals for the $T^2_{\max}$ procedure are given by

$$c^T\delta_i \in c^T\hat\delta_i \pm T_{\max}(\alpha)\sqrt{c^TEc\,w_{ii}/(n-l)}, \quad \forall c \in \Re^p$$

Confidence intervals for the pairwise differences on each variable can be obtained by substituting $c_j = (0, \ldots, 0, 1, 0, \ldots, 0)$, where the 1 is in the $j$th position, for $c$ in the previous equation. This leads to confidence intervals of the form

$$\delta_{ij} \in \hat\delta_{ij} \pm T_{\max}(\alpha)\sqrt{e_{jj}\,w_{ii}/(n-l)}, \quad i = 1, \ldots, t, \quad j = 1, \ldots, p \qquad (1)$$

where $\hat\delta_{ij}$ is the $j$th component of $\hat\delta_i$ and $e_{jj}$ is the $j$th diagonal element of $E$. Notice in constructing these intervals that whatever is to be gained over a 1-d battery of univariate procedures (the expression "1-d battery" will be described below in the example using the Iris data) depends completely upon the distribution of $T_{\max}$. This follows since $e_{jj}$ is just the univariate SSE on the $j$th dependent variable. The covariance structure within the data is largely reflected in the off-diagonal elements of $E$, and the off-diagonal elements completely dominate the tilting of the $p$-dimensional ellipsoids. But the off-diagonal elements are not used in (1), hence important information is being ignored if one looks at confidence intervals (along the dimensions of the variables as given) alone. This particular issue lies at the heart of the example we give involving the Iris data (below). We will revisit it once we get there.

CONFIDENCE  INTERVALS 

For the case of a fixed number of contrasts specified in advance, Krishnaiah (1969, p. 131) demonstrates that the $T^2_{\max}$ confidence intervals are smaller than those for Roy's root (analogous to the ellipsoid discussion below, the $T^2_{\max}$ confidence intervals provide protection for the parametric family of $a_i^TC\beta b$, for the pre-specified $a_i^T$ vectors and $\forall b \in \Re^p$). In turn, Wijsman (1979) demonstrates that the confidence intervals based upon Roy's root are the smallest among the class of all equivariant procedures which provide protection for the parametric family $a^TC\beta b$, $\forall a \in \Re^r$ and $\forall b \in \Re^p$. This class of equivariant procedures includes those based on Wilks' Lambda, the Hotelling-Lawley trace, and the Pillai-Bartlett trace. Hence the confidence intervals based upon $T^2_{\max}$ are always shorter than those based upon these equivariant procedures when the number of contrasts considered is fixed and specified in advance.

CONFIDENCE  REGIONS 

We start this section by stating (with a reference) why the $T^2_{\max}$ ellipsoids are smaller than the ellipsoids based upon Roy's root. We also illustrate the relationship between Roy's root and $T^2_{\max}$, which dramatically emphasizes the conservative nature of Roy's procedure. After that, we cite the results which show that the Roy's root ellipsoids are the smallest (for all practical purposes) among the equivariant class of MANOVA procedures which provide protection for the parametric family $a^TC\beta$, $\forall a \in \Re^r$. The conclusion then is that the $T^2_{\max}$ ellipsoids will always be smaller than ellipsoids based on equivariant MANOVA procedures in all practical cases (provided we make additional mild restrictions to the equivariant class under consideration; one such sufficient condition is that the confidence regions in $p$ space be convex).

The fact that the ellipsoids (for a fixed number of contrasts specified in advance) for the $T^2_{\max}$ procedure are always smaller than the ellipsoids based on Roy's root is an immediate consequence of the fact that $T^2_{\max}(\alpha)/(n-l) \le \lambda_\alpha$ (see Krishnaiah (1969, p. 131)). In the presentation below, we illustrate the conservative nature of the equivariant procedures via a reexpression of the Roy's root ellipsoidal simultaneous confidence regions. For clarity here, we are using $\lambda_\alpha$ to denote the $1-\alpha$ quantile of Roy's root (i.e. $P(\lambda_1 \le \lambda_\alpha) = 1-\alpha$ where we defined $\lambda_1$ at the beginning of the THEORY section); our $\lambda_\alpha$ coincides with the $\lambda_\alpha$ of Wijsman (1979) and Kres (1983).

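The ellipsoids for Roy's root are given by (see Wijsman (1979), equation (2.4))

$$(a^TC\hat\beta - a^TC\beta)\,E^{-1}\,(a^TC\hat\beta - a^TC\beta)^T \le \lambda_\alpha\,a^TC(X^TX)^{-1}C^Ta, \quad \forall a \in \Re^r \qquad (2)$$

where the $C$ matrix is the full rank version of the contrast matrix for $H_0$ as described previously, and $r$ is the rank of $H_0$ (for the 1-way case $r = l-1$). Now let $\lambda_\alpha^* = \lambda_\alpha(n-l)$. Rewrite (2) as

$$\frac{n-l}{a^TC(X^TX)^{-1}C^Ta}\,(a^TC\hat\beta - a^TC\beta)\,E^{-1}\,(a^TC\hat\beta - a^TC\beta)^T \le \lambda_\alpha^*, \quad \forall a \in \Re^r$$

which is equivalent to

$$\sup_{a \in \Re^r} T_a^2 \le \lambda_\alpha^*,$$

where

$$T_a^2 = \frac{n-l}{a^TC(X^TX)^{-1}C^Ta}\,(a^TC\hat\beta - a^TC\beta)\,E^{-1}\,(a^TC\hat\beta - a^TC\beta)^T.$$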

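We have shown in a different section that the $T^2_{\max}$ ellipsoids are given by

$$(n-l)\,(\hat\delta_i^* - \delta_i^*)^T E^{-1} (\hat\delta_i^* - \delta_i^*) \le T^2_{\max}(\alpha), \quad i = 1, \ldots, t$$

which after some rearranging can be written

$$(a_i^TC\hat\beta - a_i^TC\beta)\,E^{-1}\,(a_i^TC\hat\beta - a_i^TC\beta)^T \le a_i^TC(X^TX)^{-1}C^Ta_i \left(\frac{T^2_{\max}(\alpha)}{n-l}\right), \quad i = 1, \ldots, t \qquad (3)$$

where $a_i^T$ is a row vector in $\Re^r$ chosen so that $a_i^TC = L_{[i,]}$ (it is always possible to choose such an $a_i^T$, since given $L$, one can choose $C$ to be a matrix such that all the rows of $L$ are in the span of $C$ and $C$ has full row rank). Rewrite (3) as

$$\max_{a \in \{a_1, a_2, \ldots, a_t\}} T_a^2 \le T^2_{\max}(\alpha)$$

where

$$T_{a_i}^2 = \frac{1}{a_i^TC(X^TX)^{-1}C^Ta_i}\,(a_i^TC\hat\beta - a_i^TC\beta)\left(\frac{E}{n-l}\right)^{-1}(a_i^TC\hat\beta - a_i^TC\beta)^T$$

And, expressed in this fashion, it becomes obvious that

$$T^2_{\max} = \max_{a \in \{a_1, a_2, \ldots, a_t\}} T_a^2 \le \sup_{a \in \Re^r} T_a^2 \qquad (4)$$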

The relationship as expressed in (4) clearly shows that the $T^2_{\max}$ procedure will always outperform Roy's procedure for the situation of a fixed number of contrasts specified in advance. It also emphasizes dramatically the conservative nature of the equivariant procedures, particularly when the number of contrasts $t$ is very small.

Finally, Wijsman (1979) demonstrates that the simultaneous ellipsoidal confidence regions based upon Roy's root are the smallest among all equivariant procedures (which provide protection for the family $a^TC\beta$, $\forall a \in \Re^r$) when $p \le l - r$, where $l$ is the dimension of the space spanned by the columns of $X$ and $r$ is the number of constraints in the null hypothesis. For the case of 1-way MANOVA with 3 groups, for example, $l = 3$ and $r = 2$, since the hypothesis testing in this case essentially asks whether the 3 $p$-variate location vectors for the groups can be adequately described by a single $p$-variate location vector plus noise (see the canonical form of the MANOVA model in Lehmann (1986)). Wijsman (1979) demonstrates further that the Roy's root ellipsoids are also the smallest in the equivariant class of procedures in the case where $p > l - r$, provided some very mild conditions are imposed upon the shapes of the confidence regions desired. Requiring the confidence regions to be convex is one example of such a mild condition. Another example is requiring the confidence regions to be connected and contain their "center" (i.e. contain the point estimate for the mean vector). From an applied point of view, it would be difficult to characterize these additional conditions as restrictions at all, and it would be fair to say that Wijsman's results bring a rich history of discussion on the subject to a close.

THE  CASE  OF  TWO  GROUPS 


When there are two groups, there is only one contrast to consider. Let $n_i$ be the sample size of group $i$ and $n = n_1 + n_2$. Here

$$T^2_{\max} = \frac{n_1 n_2}{n_1 + n_2}\,(\bar x_1 - \bar x_2)^T (n-2)E^{-1} (\bar x_1 - \bar x_2),$$

which one may easily verify is the 2-sample Hotelling's $T^2$. Hence, $T^2_{\max}(\alpha, n, p) = \frac{p(n-2)}{n-p-1} F_{p,\,n-p-1}(\alpha)$ and the quantiles are available in closed form. If $T^2_{\max} > T^2_{\max}(\alpha, n, p)$, the procedure rejects.


THE  CASE  OF  THREE  GROUPS 


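In the case of three groups, there are 3 contrasts, and the distribution is no longer available in closed form although there has been a great deal of effort in that direction (see Siotani (1959) and (1992)). Using the "cell means" formulation of the design one is led to $(X^TX)^{-1} = \mathrm{diag}(1/n_1, 1/n_2, 1/n_3)$ and the contrast matrix

$$L = \begin{pmatrix} 1 & -1 & 0 \\ 1 & 0 & -1 \\ 0 & 1 & -1 \end{pmatrix}$$

which leads to

$$L(X^TX)^{-1}L^T = \begin{pmatrix} 1/n_1 + 1/n_2 & 1/n_1 & -1/n_2 \\ 1/n_1 & 1/n_1 + 1/n_3 & 1/n_3 \\ -1/n_2 & 1/n_3 & 1/n_2 + 1/n_3 \end{pmatrix}$$

So in this simple case, $w_{11} = 1/n_1 + 1/n_2$, $w_{22} = 1/n_1 + 1/n_3$, and $w_{33} = 1/n_2 + 1/n_3$. Also, $\delta_1 = \mu_1 - \mu_2$, $\delta_2 = \mu_1 - \mu_3$, and $\delta_3 = \mu_2 - \mu_3$. It is also clear (this is similar to the general case already discussed) that $\hat\delta_1 = (\bar x_1 - \bar x_2) \sim N_p(\delta_1, w_{11}\Sigma)$. Rewrite this as $\hat\delta_1^* \sim N_p(\delta_1^*, \Sigma)$ (where $\hat\delta_1^* := \hat\delta_1/\sqrt{w_{11}}$). This yields, under $H_0$ (where $\delta_1^* = 0$),

$$T_1^2 = (\hat\delta_1^* - \delta_1^*)^T \left(\frac{E}{n-3}\right)^{-1} (\hat\delta_1^* - \delta_1^*) = \frac{1}{w_{11}}\,(\bar x_1 - \bar x_2)^T \left(\frac{E}{n-3}\right)^{-1} (\bar x_1 - \bar x_2)$$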


Similarly for $T_2^2$ and $T_3^2$. Then $T^2_{\max}$ is just the largest of these 3 statistics. The procedure rejects, then, if any of the three $T_i^2$'s exceed $T^2_{\max}(\alpha, n_1, n_2, n_3, p)$. Additionally (as explained above) each $T_i^2$ which exceeds the $T^2_{\max}(\alpha, n_1, n_2, n_3, p)$ quantile is flagged as significant. As a point of reference it is helpful to note that if the group sizes are equal, then

$$T_{\max}(\alpha, n_1, n_1, n_1, 1) = \frac{1}{\sqrt{2}}\,q(\alpha, 3, n-3)$$

where $q(\alpha, 3, n-3)$ is the $1-\alpha$ quantile of the univariate studentized range with 3 groups and $n-3$ error degrees of freedom ($n := n_1 + n_1 + n_1$). Also for the case of equal group sizes, it is helpful to note that

$$T_{\max}(\alpha, n_1, n_1, n_1, p) = \frac{1}{\sqrt{2}}\,R_{MAX}(\alpha, m, \nu, p)$$

where $R_{MAX}$ is the multivariate studentized range as defined in Siotani (1992) with parameters $m$ and $\nu$. Under Siotani's definition $\nu = n - 3$ and $m = 3$.

Figure 1: Plotted are the fractions of the time (out of 10,000 resamples) when the given procedure found the (shifted) versicolor-virginica difference.
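The three-group quantities $w_{11}$, $w_{22}$, $w_{33}$ discussed in this section can be checked with exact rational arithmetic. The following Python sketch forms $L(X^TX)^{-1}L^T$ for the cell-means design, where $(X^TX)^{-1}$ is the diagonal matrix of reciprocal group sizes. The function name `contrast_covariance_factor` is hypothetical, and the group sizes (8, 12, 29) are used purely as an example.

```python
from fractions import Fraction


def contrast_covariance_factor(L, sizes):
    """Return L (X^T X)^{-1} L^T for the cell-means design.

    For the cell-means design, (X^T X)^{-1} = diag(1/n_1, ..., 1/n_l).
    """
    inv = [Fraction(1, n) for n in sizes]
    t = len(L)
    return [[sum(L[a][k] * inv[k] * L[b][k] for k in range(len(sizes)))
             for b in range(t)]
            for a in range(t)]


# All pairwise contrasts for 3 groups: mu1 - mu2, mu1 - mu3, mu2 - mu3.
L = [[1, -1, 0], [1, 0, -1], [0, 1, -1]]
W = contrast_covariance_factor(L, (8, 12, 29))
w = [W[i][i] for i in range(3)]  # w_11, w_22, w_33
```

The diagonal entries are the scale factors that standardize each pairwise difference before the $T_i^2$ statistics are formed.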


PRACTICAL UTILITY OF THE $T^2_{\max}$ PROCEDURE

It is a characteristic of the MANOVA problem that no single procedure is going to be the best for all situations. With this in mind, it seems appropriate to demonstrate the practical utility of the $T^2_{\max}$ procedure by comparing its performance with a reasonable competitor. The fact that the $T^2_{\max}$ procedure will always outperform Roy's root and, in turn, the other equivariant test statistics (for the all pairwise comparisons situation) clearly rules out procedures based on those statistics as competition. A battery of univariate procedures protected against type I error inflation via the Bonferroni method is a reasonable competitor. The studentized range procedure is known to provide the shortest confidence intervals (of the available procedures) for all pairwise comparisons for equal groups in the univariate case, and it is a remarkable fact that the procedure is conservative when the groups are not equal. Thus a battery (across dimensions) of studentized range procedures will be used. It should be noted in passing that there are situations where the studentized range 1-d battery will find differences when the $T^2_{\max}$ procedure does not, and vice versa. It should also be pointed out here that for all the cases covered by our approximations for $T_{\max}$, confidence intervals based on the studentized range 1-d battery will always be shorter than confidence intervals based on $T^2_{\max}$. Nonetheless, by comparing each $T_i^2$ with $T^2_{\max}(\alpha)$ there are many situations where the $T^2_{\max}$ procedure will find group differences when the Bonferroni-studentized range procedure will not, which this example will show.

We consider the Fisher-Anderson Iris data (1939). This data set consists of 3 groups with 4 dimensions. There is no problem in declaring the setosa group to be different from the versicolor and virginica groups. It may be surprising that the 1-d studentized range battery also finds the versicolor-virginica difference quite easily, even at very high levels of significance. However, marginal plots of petal length and petal width clearly show these two groups to be different. Shifting the versicolor group closer to the virginica group provides a more interesting example for the procedures at hand. We shift the versicolor group by adding (1, 0, 1.5, 0.5) to each observation, where the coordinates are (sepal length, sepal width, petal length, petal width). It might be even more surprising that the 1-d battery still finds the versicolor-virginica difference to be highly significant.

Figure 2: Histogram of all 4500 standard deviations for the validation runs. Overlaid on the plot are the largest 18 standard deviations.

Now suppose we took random samples without replacement of size $n_0 \le 50$ (say $n_0 = 45$, for now) from each group (we stick with equal samples since this is where the univariate studentized range procedure is the most powerful) and each time applied (1) the 1-d battery and (2) the $T^2_{\max}$ procedure. We can record whether each procedure finds the versicolor-virginica difference for each sample. We do this some large number of times. We can then repeat the whole setup for $n_0 = 40, 35, 30, \ldots$ to get some idea of how the procedures perform as we decrease the sample size.

We carry this out 10,000 times at each value of $n_0$ and summarize the results in Figure 1. Even with only a subsample of 25 points from each group, the $T^2_{\max}$ procedure suffers practically no loss in performance, declaring the versicolor-virginica difference about 98% of the time using the $\alpha = 0.01$ critical value. On the other hand, the 1-d battery declares the versicolor-virginica difference only about 56% of the time using its (conservative) $\alpha = 0.01$ critical value for subsamples of size 25 from each group. A rotating 3-d plot of the shifted data in software such as Xgobi illustrates dramatically how it is the covariance structure which separates the groups. The battery of 1-d tests is essentially blind to this type of multivariate structure. In all fairness, the $T^2_{\max}$ procedure would be blind in the case where the difference in the groups is small along a single dimension and many redundant dimensions with no differences (which contribute no help through their covariance structure, independence being one example) are included. Nevertheless, this example using the Iris data demonstrates the practical utility of the $T^2_{\max}$ procedure.

Returning to an issue mentioned earlier, what lies at the heart of this example is the fact that

$$\frac{1}{\sqrt{2}}\,q(\alpha/4, 3, n-3) < T_{\max}(\alpha, n_1, n_1, n_1, 4)$$

so that the intervals in (1) are always smaller for the 1-d battery of studentized range procedures than they are for the $T^2_{\max}$ procedure. Hence, the only way to outperform the 1-d battery is by taking advantage of the multivariate structure as we have done here (i.e. in comparing the $T_i^2$'s to $T^2_{\max}(\alpha)$ we are using the multivariate information which is contained in the off-diagonals of $E$). As mentioned above, for all the cases our equations cover,

$$\frac{1}{\sqrt{2}}\,q(\alpha/p, 3, n-3) < T_{\max}(\alpha, n_1, n_1, n_1, p)$$

so that the 1-d studentized range battery always leads to shorter confidence intervals than $T^2_{\max}$. This leads us to conjecture that in the 1-way case, the 1-d studentized range battery will always lead to shorter confidence intervals than $T^2_{\max}$. This conjecture is clearly of great relevance from a practical standpoint.


DEVELOPING  THE  EQUATIONS 

The equations we developed are based upon Monte Carlo simulation of the quantiles of $T^2_{\max}$. It would have been possible of course to provide the reader with FORTRAN code which performs simulations of $T^2_{\max}$ in a very general setting (e.g. for an arbitrary design $X$ and an arbitrary contrast matrix $L$). However, it was decided that such code would not be widely used, because it requires the user to wait for a simulated quantile (the waiting time could be as much as an hour or so). To overcome this liability, it was decided to provide approximate quantiles in some format which does not require the user to wait. One immediately thinks of tables (of approximate quantiles) as a potential such format. One prohibitive difficulty of providing tables in a very general setting is that the completely general $T^2_{\max}$ formulation is almost as general as the (multivariate) linear model. To make any headway at all in a table-type direction, one must consider particular cases of the general formulation of $T^2_{\max}$.
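To make the idea of a simulated quantile concrete, here is a minimal Python sketch of a Monte Carlo estimate of $T_{\max}(\alpha)$ for all pairwise comparisons in the univariate case ($p = 1$) under the null hypothesis. It is not the FORTRAN code referred to above: the function name `simulate_tmax_quantile`, the number of simulations, the seed, and the restriction to $p = 1$ with standard normal data are all assumptions made for illustration.

```python
import random
from itertools import combinations
from math import sqrt


def simulate_tmax_quantile(sizes, alpha, n_sim=10000, seed=0):
    """Estimate the 1 - alpha quantile of T_max for p = 1 by simulation under H0."""
    rng = random.Random(seed)
    n, l = sum(sizes), len(sizes)
    draws = []
    for _ in range(n_sim):
        # Under H0 all groups share the same mean; standard normal data suffice.
        groups = [[rng.gauss(0.0, 1.0) for _ in range(m)] for m in sizes]
        means = [sum(g) / len(g) for g in groups]
        sse = sum((x - mu) ** 2 for g, mu in zip(groups, means) for x in g)
        s2 = sse / (n - l)  # pooled variance estimate with n - l degrees of freedom
        tmax = max(
            abs(means[i] - means[j]) / sqrt(s2 * (1.0 / sizes[i] + 1.0 / sizes[j]))
            for i, j in combinations(range(l), 2)
        )
        draws.append(tmax)
    draws.sort()
    index = min(n_sim - 1, int((1.0 - alpha) * n_sim))
    return draws[index]


# Equal groups of size 10; the result can be compared with q(alpha, 3, n - 3) / sqrt(2).
q_est = simulate_tmax_quantile((10, 10, 10), alpha=0.05)
```

For equal group sizes and $p = 1$ the estimate should be close to the studentized range quantile divided by $\sqrt{2}$, which gives a simple sanity check on any such simulation. The cost of repeating this for many group-size triples and dimensions is what motivates the fitted equations.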

The simplest case of $T^2_{\max}$ is one-way MANOVA. For the unequal groups case, however, one-way MANOVA is still too general, and one must restrict the generality even further to make any progress. Within one-way MANOVA, the natural place to start is 3 unequal groups, since quantiles for 2 unequal groups are available in closed form (this was described above). For 3 groups of unequal sizes, there are really too many combinations of the group sizes to be able to construct a table (of approximate quantiles) within any reasonable number of pages. Based on this difficulty, it was decided that we should offer the user a set of equations rather than a lengthy table. Not only can the equations provide a more concise summary of the approximate quantiles, but they can also interpolate (approximate) quantiles at points where no simulations have been performed. Further, the user is not required to wait at all. Thus, equations for approximate quantiles of $T^2_{\max}$ seemed the appropriate format; this is the format we provide for the case of 3 unequal groups.

The equations were developed as follows: (a) we chose a (large) grid, say Q, of $(n_1, n_2, n_3)$ triples at which to perform simulations of $T^2_{\max}$, and (b) after simulating quantiles at each point of Q, we fit (interpolating) equations to these simulated quantiles. We will refer to the simulations used in constructing these equations as the model-building simulations. The fitting of these equations is described in the first subsection (ABRIDGED SUMMARY OF THE FITTING PROCESS) below.

To  obtain  some  idea  of  the  accuracy  of  our  equations,  we  performed  additional  simulations.  These 
additional  simulations  will  be  referred  to  as  the  validation  runs.  In  the  validation  runs,  we  performed  a 
much  larger  number  of  simulations  over  a  much  smaller  subset  of  points  of  Q.  This  is  described  in  the 
second  subsection  (ASSESSING  THE  ACCURACY  OF  THE  APPROXIMATIONS)  below. 

ABRIDGED  SUMMARY  OF  THE  FITTING  PROCESS 

It was necessary to restrict the approximations to the region (a) $4 \le n_i \le 100$, $i = 1,2,3$, and (b) $\max_{i \in \{1,2,3\}}(n_i) \le 4 \min_{i \in \{1,2,3\}}(n_i)$ in order to reduce the number of $(n_1, n_2, n_3)$ triples considered to something manageable. Even with this reduction, 9357 $(n_1, n_2, n_3)$ triples (about 5,000 distinct points) were used in the model-building simulations.
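The two restrictions on the group sizes can be expressed as a simple membership check. The Python sketch below is only an illustration of conditions (a) and (b); the function name `in_fitted_region` is hypothetical, and the sketch says nothing about how the 9357 grid points themselves were chosen within the region.

```python
def in_fitted_region(n1, n2, n3):
    """Return True if (n1, n2, n3) satisfies 4 <= n_i <= 100 and max(n_i) <= 4 * min(n_i)."""
    sizes = (n1, n2, n3)
    within_bounds = all(4 <= n <= 100 for n in sizes)  # condition (a)
    balanced = max(sizes) <= 4 * min(sizes)            # condition (b)
    return within_bounds and balanced


# The triple (8, 12, 29) is used as an example point later in the text.
example_ok = in_fitted_region(8, 12, 29)
```

A user would run such a check before applying the fitted equations, since quantiles outside this region are not covered by the approximations.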

For a given dimension (recall that dimension corresponds to $p$), at least 100,000 simulations per point (or triple) were used, more simulations being required for the higher dimensions. Using an adaptive number of simulations was attempted but proved to be too slow and unstable. Thus the number of simulations per point had to be fixed for each dimension.

Figure 3: Histogram of the absolute values of all 4500 of the fitted values - the "exact" values. Overlaid on the plot are the 14 largest absolute errors.


It  became  clear  from  the  inspection  of  many  scatter  plots  that  the  function  might  be  well  approximated 
by  a  linear  combination  of  terms  like  l/ni,  l/n2,  etc.  Least  squares  was  used  to  fit  equations  for  each 
quantile  and  each  value  of  p.  Thus  45  equations  were  fit  (5  dimensions:  p  =  1,...,5,  and  9  quantiles: 
0.8, 0.825, 0.85, . . . ,  0.925, 0.95, 0.975, 0.99).  The  equations  took  the  form 


(5) 


where $k$ denotes the order of the polynomial used (we eventually arrived at $k = 4$). This form was determined via a standard regression approach. A second order model was actually enough to remove all lack-of-fit; however, adding the third and fourth order terms provided enormous improvements in accuracy (accuracy is discussed below). It is well known that a Laurent series expansion can sometimes lead to a more parsimonious expression with fewer coefficients. While a second order Laurent series expansion provided similar regression results, the accuracy of those equations proved to be much worse than the fourth order model (5), and the Laurent series expansion required just as many coefficients.


ASSESSING  THE  ACCURACY  OF  THE  APPROXIMATIONS 


To evaluate the accuracy of the fitted equations, 100 of the original 9357 points were chosen at random (subject to some restrictions requiring that more points be chosen from the region of the smaller sample sizes, since that region is harder to fit). At each of these 100 points, at least 1,000,000 simulations were performed (again, more simulations were performed for higher dimensions). Each point was replicated 10 times in order to provide an estimate of variability at each of the 100 points. As an example, one of the 100 random points chosen was $(n_1, n_2, n_3) = (8, 12, 29)$. For $p = 1$, 1,000,000 simulations of $T_{\max}$ were performed with $(n_1, n_2, n_3) = (8, 12, 29)$. The 9 sample quantiles 0.8, 0.825, ..., 0.95, 0.975, 0.99 were recorded. This gave estimates of the true quantiles; call these estimates $T_{\max}(\alpha, 8, 12, 29, 1)_{(1)}$, $\alpha = 0.2, 0.175, \ldots, 0.025, 0.01$. We repeat this 9 more times to get $T_{\max}(\alpha, 8, 12, 29, 1)_{(2)}, \ldots, T_{\max}(\alpha, 8, 12, 29, 1)_{(10)}$, $\alpha = 0.2, 0.175, \ldots, 0.025, 0.01$. Then (use $\alpha = 0.2$ for example) we use the average of $T_{\max}(0.2, 8, 12, 29, 1)_{(1)}, \ldots, T_{\max}(0.2, 8, 12, 29, 1)_{(10)}$ as our estimate of the true quantile (we will refer to this average as the "exact" quantile). We use the standard deviation of $T_{\max}(0.2, 8, 12, 29, 1)_{(1)}, \ldots, T_{\max}(0.2, 8, 12, 29, 1)_{(10)}$ divided by $\sqrt{10}$ as an estimate of the standard deviation of the "exact" quantiles.

Figure 4: Histogram of 3*(standard deviation of "exact") + |fitted - "exact"|. Overlaid on the plot are the 10 largest such values.
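The replicate summary described above (the average of the 10 replicated quantile estimates as the "exact" quantile, and their standard deviation divided by $\sqrt{10}$ as its standard deviation) can be sketched in Python as follows. The function name `summarize_replicates` is hypothetical and the ten replicate values are invented for illustration; they are not results from the paper.

```python
from math import sqrt
from statistics import mean, stdev


def summarize_replicates(replicate_quantiles):
    """Return the 'exact' quantile (mean) and its estimated standard deviation."""
    exact = mean(replicate_quantiles)
    # Sample standard deviation of the replicates, scaled to the standard
    # deviation of their average.
    sd_exact = stdev(replicate_quantiles) / sqrt(len(replicate_quantiles))
    return exact, sd_exact


# Hypothetical replicate estimates of one quantile at one (n1, n2, n3) point.
reps = [2.101, 2.099, 2.100, 2.102, 2.098, 2.100, 2.101, 2.099, 2.100, 2.100]
exact, sd_exact = summarize_replicates(reps)
```

Dividing by the square root of the number of replicates reflects that the "exact" value is an average of independent runs, so it is less variable than any single run.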

The increase in the number of simulations per point with the increase in dimension was adequate to keep these standard deviations constant across dimensions. Again for brevity, a histogram of the standard deviations of the "exact" quantiles for all 4500 validation points (i.e. 100 points x 9 quantiles x 5 dimensions) is given in Figure 2. The largest standard deviation is 0.002853, but it is clear from the histogram that performance is generally much better than that. These "exact" estimates will be referred to as the validation runs.

To estimate the accuracy of the fitted equations, we calculate fitted - "exact" for each of the 100 validation points. A histogram of the absolute values of these results is given in Figure 3. The largest absolute error (i.e. |fitted - "exact"|) is 0.0083. The absolute errors rarely exceed 0.006, which can also be observed from the figure. In fact, the absolute errors never exceed 0.006 for $p = 1$ and $p = 2$. To be very conservative, if we believed that the true quantile were as far as ±3 standard deviations from the "exact" quantile, then the furthest our fitted values could be from the truth would be 3*(standard deviation of "exact") + |fitted - "exact"|. A histogram of these values is given in Figure 4. The largest such value is 0.0151, but the performance is clearly better than that in general.

We may use the same analysis as in the preceding paragraph on the results Siotani (1992) reports. Due to his definitions, we must rescale what he reports by $\sqrt{2}$ (the rescaling is clearly indicated in the table below). Nevertheless, a direct comparison between his results and our results is appropriate. The worst case he does report is for $p = 5$ and $\alpha = 0.01$; we reproduce this case here (the first four columns here are taken from Siotani's Table 1 and Table 3).


u     fitted    “exact”   s.d.(“exact”)   fitted - “exact”   (3 s.d. + |fitted - “exact”|)/√2

10    14.359    14.31     0.297           0.049              0.6647
20     8.340     8.339    0.077           0.001              0.1641
60     6.551     6.548    0.051           0.003              0.1103


The 0.6647 value in the last column of this table compares directly to our “worst case” value of 0.0151.
Giving Siotani more than the benefit of the doubt, 0.1641 is his second worst value, which is still more
than 10 times larger than our worst value of 0.0151.
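
The last column of the table can be recomputed from the other columns. The sketch below assumes that the last column is (3 * s.d. + |fitted - “exact”|) divided by √2, which is inferred from the table layout and the rescaling remark rather than stated as a formula in the text; the function name rescaled_bound is hypothetical. Under that assumption the recomputed values agree with the printed ones to within about 0.0001.

```python
import math

# Rows as printed in the table: (u, fitted, "exact", s.d.("exact")).
ROWS = [
    (10, 14.359, 14.31, 0.297),
    (20, 8.340, 8.339, 0.077),
    (60, 6.551, 6.548, 0.051),
]
OUR_WORST = 0.0151  # worst-case value reported for the fitted equations


def rescaled_bound(fitted, exact, sd, k=3.0):
    """Conservative bound k*s.d. + |fitted - exact|, rescaled by sqrt(2)."""
    return (k * sd + abs(fitted - exact)) / math.sqrt(2)


bounds = {u: rescaled_bound(fitted, exact, sd) for u, fitted, exact, sd in ROWS}

for u, bound in bounds.items():
    # The ratio shows how many times larger each bound is than 0.0151.
    print(f"u = {u:2d}: bound = {bound:.4f}, ratio to 0.0151 = {bound / OUR_WORST:.1f}")
```

Even the second-largest bound, at u = 20, comes out more than ten times the 0.0151 worst case, in line with the comparison drawn above.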

COMPUTING  PLATFORM  AND  OTHER  SOFTWARE 

The vast majority of the computing for these simulations was done on two computers, named student
and gauss. Student is a Silicon Graphics Indigo 2 High Impact Workstation, and Gauss is a Pentium Pro
150 running FreeBSD 2.2. The running times were roughly 3 weeks for the validation runs and roughly
2.5 weeks for the model building runs. The computer code for the simulations consisted of roughly 300
lines of FORTRAN. The code calls a solve routine from LAPACK, as well as a few basic routines from
the BLAS. The 45 separate regressions were carried out using ordinary proc reg in SAS, while diagnostic
plots for the regressions (as well as all of the plots in this paper) were done in Splus. The code used for
the random number generators was written by Barry Brown, James Lovato, and Kathy Russell. It is all
contained in the “randlib.f” library. The latest version (version 1.3) of the library is now available at
http://odin.mdacc.tmc.edu/anonftp/source.html